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Introduction 


This  project  is  concerned  with  theoretical  methods  for  designing  individualized  opti¬ 
mal  strategies  of  breast  cancer  surveillance.  The  problem  of  optimal  cancer  surveil¬ 
lance  is  set  up  as  a  search  for  optimal  scheduling  of  screening  examinations  subject 
to  certain  constraints  on  the  number  and  timing  of  medical  tests.  The  hypothesis 
to  be  tested  is  that  the  efficacy  of  breast  cancer  detection  can  be  enhanced  through 
incorporating  aggregated  family  history  information  into  a  mathematical  model  de¬ 
signed  to  construct  optimal  schedules  of  cancer  surveillance.  The  proposed  methods 
have  been  validated  using  epidemiological  data  on  breast  cancer  from  the  Utah  Pop¬ 
ulation  Data  Base  (UPDB)  linked  to  the  Utah  Cancer  Registry  (UCR).  All  tasks 
included  in  the  Statement  of  Work  have  been  addressed. 


2.  Modeling  cancer  detection 


Let  T  be  the  age  at  tumor  onset,  and  W  the  time  of  spontaneous  tumor  detection 
measured  from  the  onset  of  disease.  Introduce  the  random  variable  (r.v.)  V  to 
represent  tumor  size  at  spontaneous  detection.  Then  V  =  f(W),  where  /  :  [0,  oo)  — » 
[1,  oo)  is  a  deterministic  function  describing  the  law  of  tumor  growth.  It  is  assumed 
that 

(1)  random  variables  T  and  W  are  absolutely  continuous  and  independent; 

(2)  function  /  is  differentiable  and  f  >  0,  with  the  inverse  of  /  denoted  by  g-, 

(3)  the  rate  of  spontaneous  tumor  detection  is  proportional  to  the  current  tumor  size 
with  coefficient  a  >  0. 

The  probability  density  function  (p.d.f.)  of  the  vector  Y  =  (T  +  W,  V)  is  given 
by 

Py («,  v)  =  pT{u  -  g(v))pv(v).  (1) 

In  the  particular  case  of  exponential  tumor  growth  with  rate  A  >  0  we  have 

Py(u,v)  —  (^-e~^v~v,pT{u  —  ,  u  >  0,  1  <  v  <  eXu.  (2) 

A  A 

It  has  been  proven  (Hanin,  2001)  that  the  distribution  (2)  is  identifiable  if  the  den¬ 
sity  pr  for  the  time  of  tumor  onset  is  specified  by  the  Moolgavkar-Venzon-Knudson 
(MVK)  two-stage  model  of  carcinogenesis.  However,  the  shape  of  the  marginal  dis¬ 
tribution  of  tumor  volume  obtained  from  (2)  is  inconsistent  with  actual  observations. 
Therefore,  a  reasonable  generalization  of  the  above  distribution  is  necessary. 

To  make  the  distribution  (2)  more  flexible,  suppose  that  the  process  of  tumor 
growth  is  described  by  the  exponential  law  f(w)  =  eXw,  w  >  0,  with  a  random 
growth  rate  A.  Specifically,  we  assume  that  the  parameter  1/A  is  gamma  distributed 
with  parameters  a  and  b.  Then  we  have 


ocba  ru/lnv 

*p(u,v)  =  — r-r  /  taexp{—[b  +  a(v  —  l)]t}pr{u  —  t  In  v)dt 
r(o)  Jo  .  - 


<yba  ru 

(in -vrW)Io 


b  -I-  a(v  —  1) 

lnu 


(u  -  s)}pT(s)ds, 


(3) 
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for  u  >  0,  v  >  1.  Unfortunately,  no  identifiability  results  on  this  distribution  are 
available  because  the  corresponding  theoretical  problem  is  far  too  difficult  to  ap¬ 
proach.  Once  the  density  pr  of  the  age  at  tumor  onset  T  is  specified  within  a  certain 
parametric  family,  equation  (3)  allows  us  to  compute  p.d.f.  of  the  joint  distribution 
of  age  and  tumor  size  at  detection.  The  above  model  and  its  properties  are  discussed 
at  length  in  our  recent  paper  (Bartoszynski  et  al.,  2001)  included  in  Appendix  1. 


3.  The  estimation  procedure. 


For  practical  purposes,  the  joint  distribution  can  be  reparametrized  to  include  the 
tumor  diameter  rather  than  its  volume.  If  U  =  T  +  W  is  the  time  (age)  at  tumor 
detection  it  is  easy  to  see  that  the  joint  p.d.f.  of  ( U ,  D)  is  given  by 


p(u,  d )  =  Jo  0ae  e[b+a{{do)3  1)]pM(u-36\n(d/d0))d6,  (4) 

where  do  is  the  diameter  of  a  single  tumor  cell  (do  «,  10~3  mm)  and  pm  is  the 
p.d.f.  of  the  MVK  distribution.  The  model  depends  on  six  parameters:  the  rate 
of  detection  a,  the  mean  p  =  a/b  and  standard  deviation  a  =  y/a/b  of  the  gamma 
distribution,  and  the  identifiable  parameters  A,  B,  p  of  the  MVK  distribution.  The 
following  explicit  formula  holds  for  the  marginal  distribution  of  the  diameter: 


,  .  _  3 d2aaba 

P{  }  ~  4(b  +  a((d/do)3-l)]a+1  ' 


(5) 


It  is  important  to  note  that,  while  three  parameter  appear  in  (5)  they  are  not  all 
identifiable  from  data  on  tumor  diameters.  In  particular,  we  can  rewrite  (5)  in  terms 
of  the  two  parameters  (5  =  ap  and  5  =  p/a  as 

, 3d2/? 

p{  ]  “  «  +  #((£)3  -  DF+1 '  ( 0 

A  direct  maximization  of  the  likelihood  generated  by  formula  (4)  in  the  presence 
of  censoring,  data  truncation,  and/or  missing  size  information,  is  practically  infeasi¬ 
ble.  Each  of  the  situations  mentioned  above  lead  to  the  presence  in  the  likelihood  of 
complicated  double  integrals  (many  thousands  of  them  for  the  datasets  of  interest) 
and  their  computation  is  difficult  to  manage  from  the  point  of  view  of  both  compu¬ 
tation  time  and  error  control.  We  overcome  the  obstacle  by  using  a  Monte  Carlo  EM 
(MCEM)  algorithm,  first  proposed  by  Wei  and  Tanner  (1990)  (see  also  McLachlan 
and  Krishnan,1997;  Chan  and  Ledolter,  1995). 

In  the  standard  EM  algorithm,  the  E  step  consists  in  computing  the  conditional 
expectation  of  the  complete  data  log-likelihood  given  the  observed  data.  In  the 
MCEM  algorithm  the  conditional  expectation  of  the  log-likelihood  of  the  complete 
data  is  estimated  by  averaging  the  conditional  log-likelihoods  of  simulated  sets  of 
complete  data.  The  MCEM  algorithm  does  not  possess  the  same  monotone  conver¬ 
gence  properties  as  the  standard  EM  algorithm,  however  it  is  shown  in  (Chan  and 
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Ledolter,  1995)  that,  under  suitable  regularity  conditions,  an  MCEM  sequence  will, 
with  high  probability,  get  close  to  a  maximizer  of  the  likelihood  of  the  observed  data. 

Let  Y  =  y  be  the  observed  incomplete  data,  Z  =  z  the  missing  data,  and  X  =  x 
the  unobserved  complete  data,  with  x  =  (y,  z).  Let  be  an  arbitrary  element 
in  the  parameter  space,  Eg(-\y)  denotes  the  conditional  expectation  given  Y  =  y 
with  6  treated  as  parameter,  and  Ix(Q')  is  the  log- likelihood  of  X.  Given  a  sample 
2i(0),...  ,  zm(d)  from  p$(z\y),  the  conditional  pdf  of  Z  given  Y  =  y  and  0,  the  MCEM 
algorithm  consists  of  two  steps: 

i  m 

MCEstep  :  Q(ff  |9)  =  -  £ 

171  i=l 

Mstep  :  maximize  Q(-\9). 

A  stopping  rule  for  the  MCEM  algorithm  and  a  discussion  of  the  effect  of  the  size  m 
of  the  sample  Zi(9)  can  be  found  in  (Chan  and  Ledolter,  1995). 

In  our  particular  situation,  the  complete  data  can  be  represented  as  1(()  = 
(Y (£,  £(£)),  Z(£,  £(€)))>  where  S(£)  is  a  discrete  random  variable  that  takes  the  values 
0  when  the  observation  is  censored,  1  in  case  of  a  failure  for  which  we  measure  the 
size  of  the  tumor,  and  2  for  a  failure  when  the  tumor  size  is  not  recorded.  If  we  let 
Tc  be  the  time  of  censoring,  we  can  then  take 

Y((,  0)  =  (Z(0),  Z((,  0)  =  (Tft), 9(5)),  when  5(5)  =  0, 

Y( 5. 1)  =  (£7(5),  D( 5)),  Z( 5, 1)  =  0,  when  5(5)  =  1, 

and 

Y( 5, 2)  =  (£7(5)),  Z(i,  2)  =  (T(5),  9(5)),  when  5(5)  =  2. 

Of  course,  in  the  last  case  knowing  17, T,  and  6,  implies  knowing  W  =  U  —  T  and  D 
from  the  law  of  tumor  growth  (which  is  exponential  in  our  case).  The  log-likelihood 
for  X  can  easily  be  derived. 

The  choice  of  initial  values  for  the  six  parameters  of  the  model  is  reduced  to  a 
one-dimensional  problem  by  providing  a  preliminary  fit  of  the  tumor  size  data  by  the 
marginal  distribution  (6)  and  separately  of  the  age  data  by  the  MVK  distribution. 
This  allows  us  to  obtain  a  starting  point  estimate  for  five  out  of  the  six  parameters 
incorporated  into  the  model.  In  particular  we  can  proceed  by  choosing  the  rate  of 
detection  a  as  the  only  parameter  which  needs  to  be  assigned  an  arbitrary  initial 
value. 


4.  The  data 

The  study  population  consisted  of  people  recorded  in  the  Utah  Population  Database, 
who  were  born  between  1936  and  1941  and  for  whom  follow-up  information  is  avail¬ 
able  that  places  them  in  Utah  during  the  years  of  operation  of  the  UCR.  The  analysis 
was  performed  on  subcohorts  based  on  birth  year.  In  particular  we  looked  at  five 
separate  cohorts  for  female  breast  cancer.  As  the  UCR  has  records  post  1965  only, 
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Table  1:  Data  description. 


Birth  Cohort 

Sample  Size 

#  Failures 

#  Missing  Size  Obs. 

1918-23 

16672 

960 

368 

1924-29 

15032 

804 

300 

1930-35 

12882 

576 

185 

1936-41 

11374 

410 

96 

1942-47 

13437 

333 

79 

there  is  a  left  truncation  effect,  different  for  different  birth  cohorts,  which  is  taken  in 
account  by  our  algorithm. 

The  UCR  has  been  recording  cancer  size  since  1975.  The  latest  data  (post  1987) 
offer  the  tumor  diameter  at  the  time  of  diagnosis  in  millimiters,  while  older  records 
only  give  coarser  intervals.  In  order  to  simplify  the  analysis  we  have  decided  to 
remove  the  grouping  of  the  data  by  treating  them  as  uniformly  distributed  in  each 
interval. 

The  relevant  information  for  each  birth  cohort  in  our  population  is  given  in  Table 
1.  Here  we  denote  by  failures  the  cases  where  we  have  tumor  size  information  and 
report  as  missing  data  the  few  cases  where  the  presence  of  breast  cancer  was  known 
but  no  size  data  was  available. 


5.  Data  analysis 

To  adjust  for  birth  cohort  effects  we  extended  the  model  (4)  to  allow  the  parameter 
p  in  the  MVK  model  be  a  function  of  the  birth  cohort.  This  is  equivalent  to  using 
the  proportional  hazards  model  with  the  birth  cohort  incorporated  as  a  categorical 
covariate.  We  first  applied  our  estimation  procedure  to  one-year  subcohorts  (B36, 
B37,  ...,  B41)  comprising  the  cohort  of  individuals  born  between  1936  and  1941 
(Table  2). 

Table  3  presents  the  maximum  likelihood  estimates  of  model  parameters  that 
result  from  our  estimation  procedure  when  fitting  the  model  to  the  birth  subcohort 
data  described  in  Table  2.  The  resultant  fit  to  the  marginal  distributions  of  U  and 
D  is  showq  in  Figures  1-4.  From  Table  3,  it  is  clear  that  the  parameter  p  does  not 
very  significantly  among  the  sub-cohorts  under  study.  This  observation  allowed  us  to 
group  birth  cohort  data  in  5  year  intervals  in  further  data  analyses  required  to  finalize 
the  project  (see  below).  It  should  be  noted  that  the  likelihood  profile  with  respect 
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Table  2:  Data  description. 


Birth  Cohort 

Sample  Size 

#  Failures 

#  Missing  Size  Obs. 

B36 

1911 

87 

29 

B37 

1932 

73 

16 

B38 

1902 

76 

14 

B39 

1870 

50 

11 

B40 

1896 

63 

20 

B41 

1925 

66 

16 

s 

to  a  is  very  flat,  thereby  deteriorating  the  accuracy  of  estimation  of  the  sensitivity 
parameter  a.  Our  computer  simulations,  conducted  at  different  (fixed)  values  of 
the  parameter  a,  have  demonstrated  that  the  proposed  procedure  produces  good 
estimates  of  the  product  an  and  the  ratio  nla\  even  in  moderate  sample  studies 
these  estimates  are  stable  numerically  and  appear  to  be  fairly  close  to  the  true 
parameter  values  in  a  wide  range  of  a. 

The  same  analysis  was  performed  on  five  separate  birth  cohorts,  each  encom¬ 
passing  a  contiguous  six-year  period  (Table  1).  The  resultant  fit  to  the  marginal 
distributions  of  tumor  diameter  and  age  at  diagnosis  is  shown  in  Figures  5  and  6; 
the  corresponding  parameter  estimates  are  given  in  Table  4.  The  model  provided  an 
excellent  description  of  all  the  cohorts  under  study.  The  mutual  dependence  of  U 
and  D,  as  captured  by  the  expected  tumor  size  conditional  upon  age  at  detection,  is 
also  consistent  with  the  data. 


6.  Optimal  screening  schedules 

The  sequence  of  moments  of  time  assigned  for  medical  exams  and  counted  from  the 
birth  of  a  patient  are  called  a  screening  schedule.  Let  T  be  the  set  of  all  possible 
screening  schedules  r  =  {ri  <  r2  <  ...  <  rn}.  The  set  T  may  be  subject  to  (some  of) 
the  following  restrictions: 

(a)  n  <  no,  where  no  is  an  upper  bound  for  the  number  of  exams; 

(b)  ri  m  and  rn  <  M,  where  m  and  M  are  the  earliest  and  the  latest  times  for 
the  first  and  the  last  exams,  respectively; 

(c)  Tj+i  —  Tj  >  h  >  0  for  all  i  =  1,2,  ...,n  —  1.  This  condition  suggests  a  lower 
bound  h  for  the  minimal  duration  between  any  two  successive  exams. 
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Table  3:  Maximum  likelihood  estimates  of  model  parameters  for  birth  sub-cohorts 
B36-B41 


Parameter 

Estimated  Value 

a 

2.00  10“13 

1.01 

a 

1.19  10'1 

A 

2.51  10"1 

B 

6.63  10~6 

#>(36) 

1.47  10~2 

#>(37) 

1.17  10-2 

#>(38) 

1.28  10“2 

#>(39) 

9.19  10~3 

#>(40) 

1.35  10-2 

#>(41) 

1.39  IQ"2 

t 
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Table  4:  Maximum  likelihood  estimates  of  model  parameters  for  five  birth  cohorts 


Parameter 

Estimated  Value 

a 

2.00  10~13 

1.01 

a 

9.09  10~2 

A 

1.45  10'1 

B 

6.04  lO"5 

p(18-23) 

2.94  10"2 

p(24-29) 

3.65  10"2 

p(30-35) 

3.33  10"2 

p(36-41) 

4.21  10"2 

p(42-47) 

5.11  10~2 

t 
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Fig.  1 


Time  (years) 


Figure  1.  Parametric  versus  non-parametric  estimates  of  the  survivor  function  for 
one  sample  sub-cohort (B36).  The  solid  line  represents  the  Kaplan-Meier  estimate, 
while  the  dotted  line  gives  the  model-based  parametric  estimate.  The  two  curves  are 
practically  identical. 
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Fig.  2 


Figure  2.  Parametric  versus  non-parametric  estimates  of  the  hazard  function  for  one 
sample  sub-cohort  (B36).  The  solid  line  represents  a  local  likelihood  kernel-smoothed 
estimate,  while  the  dotted  line  shows  the  model-based  parametric  estimate. 
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Figure  3.  Parametric  versus  non-parametric  estimates  of  the  tail  function  of  the  tu¬ 
mor  diameter  at  diagnosis  for  the  total  population  studied  (birth  years  1936  through 
1941).  The  solid  line  represents  the  Kaplan-Meier  estimate,  while  the  dotted  line 
gives  the  values  computed  using  our  model. 
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Figure  4.  This  figure  shows  a  histogram  for  the  tumor  diameter  at  diagnosis  of 
the  total  population  studied  (birth  years  1936  through  1941)  and  the  corresponding 
probability  density  predicted  by  the  model. 
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Figure  5.  Distribution  densities  for  tumor  size  at  detection. 
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Figure  6.  Hazard  functions  for  age  at  detection.  Solid  lines:  local  likelihood  kernel- 
smoothed  estimates,  dotted  lines:  model-based  parametric  estimates. 


We  distinguish  between  spontaneous  (incidental)  and  screening  based  tumor  de¬ 
tections.  The  first  occurs  in  the  absence  of  or  concurrently  with  screening  and  is 
thought  of  as  a  continuous  process.  In  contrast  to  this,  screening  based  detection 
is  an  instantaneous  event  that  may  occur  only  at  the  moments  of  the  prescribed 
medical  exams  and  is  therefore  a  discrete  process.  When  both  types  of  detection  are 
present,  they  can  be  viewed  as  competing  risks. 

For  a  discrete  screening  schedule  r  €  T,  we  define  the  efficiency  functional  as  the 
Kantorovich  distance  between  the  tumor  sizes  So  and  S  at  spontaneous  and  com¬ 
bined  detection: 


d(S,So;r)  =  yi  |65o(s)-Gs(s)|ds. 

Since  Gs0(s)  >  Gs(s)  for  all  s  >  1, 

d(S,  So;t)  =  E  {So}  —  E  {S}, 
where  E{.}  stands  for  the  expectation. 

/■ 

y 

Suppose  the  law  of  tumor  growth  is  given  by  fe(t )  and  0  is  non-random.  Then 
<j(s,s„;r)  =  x;  r  t 

i=0  Jri  j=i+ 1 

x  [1  —  e~a^Tj~^]R(Tj  —  t)dGx{t), 


where 


r°°  -  i 

Rd(x)  :=  /  Gw0(w)fg(w)dw,  x>0. 

Jx 

The  extension  of  this  formula  to  the  case  of  random  6  is  straightforward  (see  the 
paper  by  Hanin  et  al.,  2001,  included  in  Appendix  4). 


7.  The  Effects  of  Family  History 

7.1.  Estimation  of  the  hazard  rate 

Proceeding  from  preliminary  studies  of  different  spline  estimation  procedures,  we 
chose  to  model  the  hazard  function  via  quadratic  splines.  A  quadratic  spline  with  m 
knots  specifies  the  hazard  to  be  of  the  form 

2  m 

Am ( t )  =  ^  +  51  7?2 (t  ~  Tj)l  (1) 

{  i=0  j= 1 

where  (a;)+  =  max  (a;,  0).  For  each  birth  cohort,  we  fit  splines  with  knots  which  are 
equally  spaced  in  the  interior  of  the  interval  [Tmin,  Tmax ],  where  Tmin  is  the  minimum 
truncation  age  in  the  cohort  and  Tmax  the  maximum  follow-up  (failure  or  censoring) 
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time.  Restrictions  are  placed  on  the  coefficients  to  ensure  that  A m(t)  remains  positive 
for  all  t.  Thus  with  m  knots  the  number  of  parameters  is  m  +  3.  Models  can  be 
fit  using  maximum  likelihood  techniques  applied  to  the  corresponding  conditional 
likelihood,  as  discussed  in  Boucher  and  Kerber  (2001a). 

We  have  developed  software  designed  to  compute  the  spline  estimates  by  max¬ 
imizing  the  likelihood  function  using  the  algorithm  of  Powell.  We  start  with  one 
knot  and  increase  the  number  of  knots  until  the  fit  is  not  improved,  as  determined 
by  the  likelihood  ratio  test  at  the  significance  level  a  =  0.05.  Three  other  subcohort 
estimates  of  the  hazard  function  were  computed  for  comparison  with  the  spline  esti¬ 
mator;  an  estimator  of  the  life  table  type,  a  Gaussian  kernel  estimate  based  on  the 
Nelson-Aalen  nonparametric  estimator,  and  local  likelihood  estimators  with  differ¬ 
ent  kernels  (uniform,  Epachechnikov,  and  Gaussian).  All  the  estimators  mentioned 
above  are  in  good  agreement  with  each  other  when  applied  to  the  UPDB  data. 

Using  the  computer  programs  developed  in  Year  1,  the  hazard  function  for  can¬ 
cer  incidence  has  been  estimated  from  left  truncated  and  right  censored  data  on 
individuals  identified  through  the  UPDB  and  UCR. 

Although  the  estimates  become  less  reliable  at  increasing  age,  the  hazard  function 
for  breast  cancer  appears  to  be  essentially  non-decreasing  in  all  the  categories  of  all 
familial  measures  considered.  Thus  we  find  no  evidence  of  an  ’’immune  fraction”  in 
this  analysis.  The  curves  for  different  levels  of  risk  appear  not  to  merge  or  cross, 
indicating  that  the  increased  risk  to  those  with  a  family  history  does  not  dissipate 
after  a  certain  age. 

This  study  is  presented  at  length  in  the  paper  by  Boucher  and  Kerber  (2001a) 
included  in  Appendix  2. 


7.2.  Measures  of  Familial  Aggregation  as  Predictors  of  Breast 
Cancer  Risk 

Several  measures  of  familial  disease  aggregation  have  been  proposed,  but  only  a  few 
of  these  are  designed  to  be  implemented  at  the  individual  level.  We  have  evaluated 
four  of  them  in  the  context  of  breast  cancer  incidence.  After  extensive  discussions, 
we  came  to  the  conclusion  that  testing  different  measures  of  family  history  with 
simulated  data  was  not  warranted  in  view  of  the  fact  that  such  a  study  would  have 
added  little  to  the  results  of  real  data  analysis.  Therefore,  we  decided  to  focus  on  a 
more  comprehensive  analysis  of  epidemiological  data  employing  a  wider  spectrum  of 
potential  predictors  of  breast  cancer  risk. 

A  population-based  cohort  consisting  of  114,429  women  born  between  1874  and 
1931  and  at  risk  for  breast  cancer  after  1965  was  identified  by  linking  the  UPDB 
and  the  UCR.  Three  competing  methods  were  used  to  obtain  predictors  of  familial 
aggregation  of  risk:  the  number  of  first  degree  relatives  with  breast  cancer,  the 
posterior  probability  of  carrying  BRCA1  or  BRCA2,  and  the  Familial  Standardized 
Incidence  Ratio  (FSIR),  which  weights  the  disease  status  of  relatives  based  on  their 
degree  of  relatedness  with  the  proband.  Spline  regression  methods  were  used  to 
estimate  the  hazard  function,  stratified  by  measures  of  familial  aggregation. 

We  dichotomized  each  of  our  measures  of  familial  risk,  with  the  high  risk  category 
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representing  approximately  8.5%  of  the  data  in  each  case.  This  was  a  natural  cut 
point,  as  it  represents  the  proportion  of  subjects  with  one  or  more  first  degree  rela¬ 
tives  with  breast  cancer.  The  cutoff  for  FSIR  roughly  corresponds  to  a  relative  risk 
of  two  to  family  members.  The  cut  points  for  the  posterior  probability  of  BRCA1 
and  BRCA2  come  at  points  where  the  posterior  probability  is  rather  small,  less  than 
0.0005  in  both  cases. 

Our  previous  analysis  indicated  that  a  highly  significant  birth-year  effect  exists 
in  the  data,  with  a  women  born  ten  years  later  having  an  estimated  40%  increased 
age-specific  risk.  Birth-year  was  included  as  an  additional  covariate  in  all  regression 
analyses.  The  baseline  risk  was  estimated  using  splines,  with  the  proportional  haz¬ 
ards  model  used  for  birth-year  and  familial  risk.  As  with  most  of  the  models,  we 
found  that  two  knots  were  sufficient  to  provide  an  optimal  fit. 

The  presence  of  a  first  degree  relative  with  breast  cancer  and  the  dichotomized 
FSIR  variable  each  appear  to  be  equally  effective  at  distinguishing  high  risk  sub¬ 
jects,  with  the  high  risk  category  having  about  double  the  risk,  while  the  posterior 
probability  of  BRCA1  and  BRCA2  appear  to  be  less  effective. 

We  performed  a  more  detailed  stratified  analysis  of  FSIR.  The  category  bound¬ 
aries  were  the  approximate  75th,  90th,  and  99.9th  percentiles  of  the  (adjusted)  FSIR 
distribution.  The  upper  category  roughly  corresponds  to  the  reported  fraction  of 
the  general  population  carrying  known  breast  cancer  genes.  Bootstrap  confidence 
bands  were  computed  as  well  as  an  indicator  of  the  reliability  of  the  estimates.  The 
bootstrap  confidence  intervals  are  based  on  100  bootstrap  samples,  except  for  the 
<  75th  percentile  category,  which  is  based  on  20  bootstrap  samples,  because  of  the 
extensive  time  it  took  to  fit  the  models  to  the  large  datasets. 

We  incorporated  the  posterior  probabilities  of  BRCAl  and  BRCA2  and  their  log¬ 
arithms,  as  well  as  log  log  FSIR  as  continuous  variables  in  separate  analyses,  using 
a  proportional  hazards  model  with  birth-year  as  an  additional  covariate.  The  best  re¬ 
sult  (in  terms  of  statistical  significance)  was  obtained  by  including  the  log  log  FSIR, 
where  we  get  a  likelihood  ratio  Xi  =  316.72  (p  <  0.00001). 

We  also  considered  the  indicator  variable  NFIRST  for  presence/absence  of  a 
first  degree  relative,  in  a  proportional  hazards  model.  The  behavior  of  the  hazard 
function  across  different  strata  shows  that  the  proportional  hazards  assumption  is 
not  grossly  violated.  The  variable  NFIRST  was  highly  significant  (likelihood  ratio 
Xi  —  185.6,  p  <  0.0001).  Addition  of  a  second  indicator  variable  for  two  or  more 
first  degree  relatives  with  breast  cancer  did  not  improve  the  likelihood  significantly. 
More  technical  details  on  this  study  are  given  in  the  paper  by  Boucher  and  Kerber 
(2001b)  included  in  Appendix  3. 

7.3.  Individualized  strategies  of  optimal  screening 

In  our  analysis,  the  function  fe(t)  was  taken  to  be  deterministic  exponential  with  rate 
A.  The  reciprocal  of  A  was  assumed  to  be  gamma-distributed  with  shape  parameter 
a  and  scale  parameter  b.  To  evaluate  the  effect  of  family  history,  the  data  were 
stratified  by  FSIR  value  and  all  parameters  of  the  model  were  estimated  from  each 
stratum.  To  prevent  the  strata  from  being  too  small,  we  divided  the  data  in  two 
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Table  5:  Parameter  estimates  obtained  from  the  UPDB  data  stratified  by  FSIR. 


Parameter 

FSIR  <  1 

FSIR  >  1.8 

a 

2  x  10-13 

2  x  10-13 

0.76 

1.08 

a 

0.074 

0.118 

A 

0.130 

0.133 

B 

6.13  x  10~5 

8.72  xlO-5 

p  (1918-23) 

0.034 

0,045 

p  (1924-29) 

0.037 

0.060 

p  (1930-35) 

0.044 

0.064 

p  (1936-41) 

0.057 

0.079 

p  (1942-47) 

0.048 

0.087 

groups  defined  as  FSIR  <  1  and  FSIR  >  1.8.  Using  the  estimation  procedure 
described  in  Sections  4  and  5  we  obtained  estimates  of  the  basic  parameters  for  each 
stratum  (Table  5). 

In  each  stratum,  the  search  for  optimal  screening  schedules  and  optimal  screening 
efficiency  was  conducted  for  a  fixed  number  n  =  10  of  screens  with  no  restriction  on 
the  moments  of  exams.  The  period  of  observation  (horizon)  was  truncated  at  100 
years.  The  method  of  optimization  was  the  exhaustive  search  with  the  step  of  0.25 
years. 

For  both  groups,  our  algorithm  results  in  the  same  optimal  schedule  represented 
by  the  following  sequence  of  screening  ages  (years):  To  =  74.5,  T\  =  78,  T2  =  81,  T3  = 

83.75,  T4  =  86.50,  r5  =  89.25,  r6  =  92,  r7  =  94.75,  t$  —  97.50,  r9  =  100.  For  this 
schedule,  the  intervals  Aj  :=  r*  —  t*_  1,  i  —  l,...,n,  between  two  successive  exams 
are  as  follows:  Ai  =  3.5,  A2  =  3.0,  A3  =  3.0,  A4  =  2.75,  A5  =  2.75,  A6  =  2.75,  A7  = 

2.75,  Ag  =»2.75,  Ag  =  2.5.  Thus  the  structure  of  the  optimal  schedule  appears  to 
be  the  same  for  both  groups  thereby  indicating  that  individualization  of  screening 
schedules  is  not  warranted.  However,  this  unique  optimal  schedule  does  provide  a 
tangible  gain  (relative  to  the  absence  of  screening)  in  terms  of  the  mean  tumor  volume 
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Table  6:  The  percent  decrease  in  the  mean  tumor  volume  under  the  optimal  screening 
schedule  (relative  to  the  absence  of  screening). 


Birth  Cohort 

FSIR  <  1 

FSIR  >  1.8 

1918-23 

0.14% 

0.19% 

1924-29 

0.14% 

0.25% 

1930-35 

0.16% 

0.82% 

1936-41 

0.21% 

0.32% 

1942-47 

0.18% 

0.34% 

at  diagnosis,  the  gain  being  higher  in  the  high  risk  group  ( FSIR  >  1.8)  as  evidenced 
by  the  results  shown  in  Table  6.  The  optimal  schedule  depends  quite  strongly  on  the 
sensitivity  parameter  a  which  needs  to  be  more  reliably  estimated  from  similar  data 
generated  by  randomized  screening  trials.  A  research  paper  summarizing  the  above 
results  is  in  preparation. 


8.  Key  Research  Accomplishments 

Our  key  accomplishments  can  be  summarized  briefly  as  follows: 

•  We  have  developed  computer  programs  implementing  four  statistical  procedures 
for  estimation  of  the  hazard  function;  these  procedures  accommodate  data  subjected 
to  random  truncation  and  censoring. 

•  A  new  method  has  been  developed  for  designing  optimal  schedules  of  breast 
cancer  surveillance  specially  adapted  to  population-based  settings. 

•  Numerical  experiments  have  shown  that  mathematical  and  computational  prob¬ 
lems  of  optimal  cancer  surveillance  are  tractable  within  the  framework  of  the  pro¬ 
posed  model  of  cancer  surveillance  and  detection. 

•  The  joint  distribution  of  age  and  tumor  size  at  diagnosis  has  been  derived 
within  the  framework  of  the  proposed  model  of  the  natural  history  of  breast  cancer. 

•  A  Monte-Carlo  EM  algorithm  has  been  developed  for  estimation  of  the  param¬ 
eters  incorporated  into  the  joint  distribution  of  age  and  tumor  size  at  detection. 

•  The  usefulness  of  the  estimation  procedure  was  evaluated  by  computer  simu¬ 
lations. 
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•  The  estimation  procedure  was  applied  to  epidemiological  data  on  individuals 
identified  through  the  UPDB  and  stratified  by  one  of  the  most  widely  accepted 
indicator  of  family  history  of  breast  cancer  (FSIR).  This  application  provided  values 
of  model  parameters  to  be  used  for  evaluating  potential  benefits  from  individualized 
schedules. 

•  Given  the  estimated  parameter  values  optimal  schedules  have  been  constructed 
using  the  stratified  data  on  breast  cancer.  This  study  has  shown  that  the  optimal 
schedule  of  breast  cancer  screening  is  robust  to  variations  in  familial  risk. 
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10.  Conclusions 

A  version  of  the  Monte  Carlo  EM  algorithm  has  been  developed  for  maximum  like¬ 
lihood  inference  based  on  the  distribution  of  age  and  tumor  size  at  detection.  This 
algorithm  was  tested  by  computer  simulations  and  an  application  to  breast  cancer 
data  obtained  from  the  UPDB  and  UCR  datasets.  Notwithstanding  the  fact  that  the 
likelihood  profile  with  respect  to  the  sensitivity  parameter  a  appears  to  be  very  flat, 
the  proposed  procedure  produces  good  estimates  of  the  product  afj,  and  the  ratio 
n/ a;  these  estimates  are  quite  insensitive  to  specific  values  of  the  parameter  a. 

We  have  explored  several  methods  of  measuring  familial  aggregation  at  the  indi¬ 
vidual  level  as  applied  to  breast  cancer  data.  All  prove  to  be  significant  predictors  of 
individual  risk.  Judging  by  the  difference  in  risk  estimates,  as  well  as  the  likelihood 
ratio  test,  presence  of  a  first  degree  relative  and  FSIR  appear  to  be  better  indicators 
of  increased  risk  than  the  posterior  probability  of  BRCA1  or  BRCA2.  Proceeding 
from  these  results,  we  used  the  simplest  indicator,  namely  the  presence  of  a  first 
degree  relative  with  breast  cancer,  for  the  purposes  of  data  stratification.  We  con¬ 
structed  optimal  schedules  of  cancer  surveillance  (screening)  to  each  data  stratum. 
The  next  step  was  comparing  the  optimal  schedules  thus  obtained  and  evaluate  their 
efficacy  in  terms  of  the  proposed  criterion  of  optimality. 

Given  the  estimated  parameter  values  an  optimal  schedule  has  been  constructed 
using  the  available  data  on  breast  cancer.  This  schedule  provides  a  maximum  reduc¬ 
tion  of  the  mean  tumor  size  at  detection  over  a  set  of  discrete  screening  schedules. 
While  the  efficacy  of  the  optimal  schedule  tends  to  be  higher  in  high  risk  families, 
its  structure  appears  to  be  robust  to  variations  in  breast  cancer  risk.  The  optimal 
schedule  appears  to  depend  quite  strongly  on  parameters  characterizing  the  sensi¬ 
tivity  of  spontaneous  and  screen-based  detection.  More  reliable  estimates  of  these 
parameters  are  needed.  This  is  likely  to  be  accomplished  by  analyzing  similar  data 
generated  by  randomized  screening  trials.  The  question  of  whether  the  expected 
gain  in  the  mean  tumor  size  at  diagnosis  translates  into  a  tangible  survival  benefit 
remains  to  be  addressed  in  future  studies. 


So  What? 

This  study  shows  that  the  efficacy  of  breast  cancer  screening  (in  terms  of  tumor  size 
at  diagnosis)  varies  depending  on  family  history  and  genetic  predisposition  for  breast 
cancer.  However,  this  effect  is  not  sufficiently  strong  to  change  the  structure  of  an 
optimal  schedule  of  breast  cancer  screening  designed  to  maximize  the  reduction  of 
tumor  size  .at  the  time  of  detection.  Further  studies  are  needed  to  establish  a  link 
between  the  effect  of  screening  on  the  distribution  of  tumor  size  at  detection  and 
post-treatment  survival  of  patients  diagnosed  with  breast  cancer. 
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Abstract — The  paper  explores  methodological  and  mathematical  aspects  of  a  new  approach  to 
constructing  optimal  schedules  of  cancer  screening.  This  approach  consists  of  systematic  use  of  tumor 
size  at  detection,  combining  stochastic  models  of  tumor  latency,  tumor  growth  and  tumor  detection, 
and  employing  a  new  biologically  natural  screening  efficiency  criterion  defined  as  the  Kantorovich 
distance  between  the  tumor  size  at  spontaneous  detection  in  the  absence  of  screening  and  the  tumor 
size  at  detection  when  both  spontaneous  and  screening  based  mechanisms  are  in  place.  An  explicit 
formula  for  the  efficiency  functional  is  obtained.  Sample  calculations  suggest  that  in  the  case  of 
exponential  tumor  growth,  the  optimal  screening  schedules  with  a  fixed  number  of  exams  have  a 
trend  to  uniformity.  ©  2001  Elsevier  Science  Ltd.  All  rights  reserved. 


Keywords — Screening,  Optimal  schedules,  Tumor  onset,  Tumor  size,  Carcinogenesis  models. 


1.  INTRODUCTION 

Because  of  the  significant  cancer  incidence  and  progress  of  tumor  detection  technology,  cancer 
surveillance  and  screening  are  becoming  increasingly  important  and  costly  public  health  problems. 
It  is  clear  that  appropriate  mathematical  methods  are  indispensable  for  a  more  effective  manage¬ 
ment  of  the  caseload  through  designing  optimal  surveillance  strategies.  Interest  in  exploring  this 
avenue  has  quickened  in  the  past  few  years  [1-17]. 

The  present  work  discusses  methodological  aspects  of  a  new  approach  to  optimization  of  cancer 
screening  allowing  for  cancer  detection  at  the  earliest  stages  of  tumor  development.  This  makes 
the  chances  of  tumor  cure  more  favorable,  reducing  the  probability  of  tumor  recurrence.  The 

This  work  was  supported  by  Grant  DAMD17-98- 1-8256  awarded  by  the  U.S.  Army  Medical  Research  Acquisition 
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problem  of  optimal  cancer  surveillance  is  set  up  as  a  search  for  optimal  scheduling  of  screens 
subject  to  certain  constraints  on  the  number  and  timing  of  medical  exams.  Problems  of  a  similar 
nature  have  already  been  addressed  in  the  literature.  Yakovlev  and  Tsodikov  [1]  have  developed 
methods  for  constructing  optimal  surveillance  strategies  based  on  the  minimum  delay  time  cri¬ 
terion,  given  that  the  total  number  of  examinations  is  fixed;  see  also  [2].  They  used  dynamic 
programming  methodology  to  solve  the  associated  optimization  problem.  Their  results  show  that 
this  approach  holds  much  promise  for  further  practical  use.  As  one  example,  the  current  practice 
for  the  breast  cancer  post-treatment  surveillance  at  Curie  Institute  (Paris,  France)  is  to  examine 
the  patients  once  per  semester  for  the  first  four  years,  once  per  year  for  the  next  six  years,  and 
once  every  two  years  for  the  remaining  period.  For  this  strategy,  the  estimated  false  negative 
rate  appears  to  be  equal  to  0.2  with  the  mean  delay  of  the  recurrence  detection  4.1  months.  Tak¬ 
ing  advantage  of  a  previously  proposed  parametric  model  of  tumor  recurrence  [3],  the  authors 
constructed  the  optimal  strategy  that  provides  a  33%  reduction  in  the  delay  time,  with  the  tests 
that  comprise  the  optimal  surveillance  schedule  tending  to  be  more  frequent  when  the  hazard  rate 
for  the  time  to  tumor  is  high.  However,  there  are  two  weak  points  in  this  approach.  First,  the 
probability  of  tumor  detection  is  assumed  to  be  independent  of  the  process  of  tumor  regrowth. 
Second,  estimation  of  the  tumor  onset  time  distribution  is  feasible  only  if  a  sample  of  diagnostic 
times  produced  by  a  discrete  surveillance  program  with  known  false  negative  rate  is  available. 
The  same  applies  equally  to  prediagnosis  screening  programs. 

An  alternative  approach  to  the  problem  is  to  minimize  the  average  cost  of  surveillance  account¬ 
ing  for  both  examination  costs  and  costs  of  late  detection  [1,4-15].  Since  the  two  cost  constituents 
are  linked  in  the  optimization  procedure,  the  cost- utility  approach  makes  it  possible  to  search  for 
both  the  optimal  number  of  examinations  and  their  sequence  in  time.  However,  the  costs  of  late 
detection  are  usually  very  difficult  to  evaluate.  For  yet  another  optimization  criterion  based  on 
the  power  of  a  statistical  test  for  mortality  rates,  the  reader  is  referred  to  [1G]. 

Focusing  our  effort  on  possible  medical  rather  than  economic  benefits,  we  propose  to  explore 
a  new  approach  to  the  problem  which  is  based  on  tumor  size  at  detection.  Tumor  size  is  one 
of  the  most  clinically  significant  characteristics  of  tumor  maturity  that  determines  largely  the 
probability  of  both  spontaneous  and  screening  based  tumor  detection.  This  approach  makes  it 
possible  to  utilize  data  on  tumor  size  at  detection  as  an  additional  source  of  information  on  the 
natural  history  of  the  disease;  some  readily  available  epidemiologic  data  obtained  from  the  control 
population  in  the  absence  of  screening  appear  to  be  sufficient  for  estimation  purposes.  Another 
advantage  of  this  approach  is  that  it  offers  a  natural  way  for  incorporating  the  stage  of  tumor 
progression,  where  cancer  detection  normally  occurs,  into  stochastic  models  of  carcinogenesis. 
The  proposed  model  of  tumor  progression  accommodates  a  wide  range  of  deterministic  and 
stochastic  laws  of  tumor  growth. 

As  a  measure  of  the  effect  of  screening,  we  propose  to  use  the  difference  between  the  expected 
tumor  sizes  at  detection  with  and  without  screening,  which  coincides  with  the  Kantorovich  dis¬ 
tance  [18-21]  between  the  distributions  of  the  corresponding  random  variables.  The  structure 
of  this  distance  allows  for  characterizing  the  net  effect  of  screening,  as  compared  to  that  of 
spontaneous  detection. 

Further  advancements  of  the  proposed  approach  to  constructing  optimal  schedules  of  cancer 
screening  will  hopefully  give  answers  to  the  following  questions  of  major  theoretical  and  practical 
importance. 

1.  Is  the  optimal  efficiency  of  screening  high  enough  to  warrant  its  implementation? 

2.  What  is  the  relation  between  the  optimal  screening  schedules  and  their  efficiencies  for  the 
criteria  based  on  the  tumor  size  and  the  expected  time  delay? 

3.  What  are  cancer  specific  patterns  of  optimal  screening  schedules? 

4.  What  is  the  impact  of  hypothesized  laws  of  tumor  growth  on  the  optimal  screening  effi¬ 
ciency  and  the  pattern  of  the  optimal  examination  schedules? 
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5.  What  are  quantitative  characteristics  of  the  initiation,  promotion,  and  progression  stages 
for  specific  cancers? 

The  structure  of  the  present  paper  is  as  follows.  In  Section  2,  we  describe  some  models  of  the 
natural  history  of  cancer  (including  cancer  latency  and  growth),  screening  schedules,  and  cancer 
detection.  Here,  we  also  formulate  basic  assumptions  and  introduce  mathematical  formalism.  An 
explicit  formula  for  the  efficiency  functional  is  derived  in  Section  3.  Sample  numerical  calculations 
and  analysis  of  their  results  are  addressed  in  Section  4. 

2.  BASIC  NOTIONS 


2.1.  Models  of  Carcinogenesis 

In  describing  the  natural  history  of  cancer,  the  process  of  tumor  development  can  be  broken 
down  into  three  stages.  These  stages  are: 

•  formation  of  initiated  cells, 

•  promotion  of  initiated  cells  resulting  in  appearance  of  the  first  malignant  clonogenic  cell, 
and 

•  subsequent  growth  and  progression  of  malignant  tumor. 

The  duration  of  each  stage  of  carcinogenesis  is  thought  of  as  a  random  variable  (r.v.).  In  our 
sample  calculations  presented  in  Section  4,  we  use  a  two-parameter  gamma  family  to  specify 
the  distribution  of  the  length  of  the  first  two  stages  of  carcinogenesis.  However,  more  elaborate 
mechanistic  models  of  carcinogenesis  are  available  to  describe  the  time  to  the  event  of  malignant 
transformation.  We  provide  two  examples  of  such  models. 

The  most  widely  accepted  model  of  tumor  latency  is  commonly  referred  to  as  the  Moolgavkar- 
Venzon-Knudson  (MVK)  Model  [22,23],  This  Markovian  two-stage  model  involves  four  param¬ 
eters  that  refer  to  the  rates  of  initiation  of  target  stem  cells  (that  is,  formation  of  primary 
precancerous  lesions),  and  rates  of  division,  death  or  differentiation,  and  malignant  transforma¬ 
tion  of  initiated  cells.  It  was  first  pointed  out  by  Heidenreich  [24]  and  subsequently  by  Hanin 
and  Yakovlev  [25]  and  Heidenreich  et  al  [26]  that  these  four  parameters  are  not  jointly  identifi¬ 
able  from  time-to-tumor  data.  In  the  case  of  constant  parameters,  all  triples  of  their  identifiable 
combinations  were  described  at  length  in  [25].  In  the  latter  case,  the  MVK  model  leads  to  the 
following  explicit  formula  for  the  distribution  of  the  total  duration  T  of  the  first  two  stages,  that 
is,  of  the  time  from  the  birth  of  an  individual  to  the  tumor  onset  [2?, 28], 


FT{t)  :=  Pr(T  >  t)  = 


( a  -I-  b)eat  1 p 
b  4-  ae^+W  ’ 


t  >  0. 


(1) 


Here  a,  6,  p  >  0  are  identifiable  parameters  of  the  model,  Ft  :=  1  —  Ft  is  the  survivor  function 
of  the  r.v.  T,  and  Ft  is  the  cumulative  distribution  function  (c.d.f.)  of  the  r.v.  T. 

Another  model  of  carcinogenesis  was  proposed  by  Yakovlev  and  Polig  in  [29].  According  to 
this  model,  the  hazard  function  <j>  of  the  time  T  of  tumor  latency,  which  is  related  to  the  survivor 
function  by 

FT(t)  =  e~  J o  *<•>*,  t  >  0,  (2) 


is  of  the  form  s 

<A(s)  =  die-9 2  /o  du  [  h(u)f{s  -  it)  du,  s  >  0,  (3) 

Jo 

where  h  is  a  given  time-dependent  rate  of  external  exposure,  /  is  the  probability  density  function 
(p.d.f.)  of  the  tumor  promotion  time,  and  61,62  are  positive  constants.  The  key  feature  of  the 
Yakovlev-Polig  model  is  that  it  allows  for  the  process  of  cell  death  to  compete  with  the  process 
of  tumor  promotion.  Two  particular  cases  of  the  model  referring  to  spontaneous  and  induced 
carcinogenesis  were  employed  in  [30]  and  [31]  to  study  the  distribution  of  tumor  size  under  a 
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threshold  type  mechanism  of  tumor  detection.  Recently,  Hanin  and  Boucher  [32]  found  conditions 
under  which  the  parameters  /,  0 i,  02  of  the  model  given  by  (3)  are  identifiable  from  time-to- 
tumor  observations.  Specifically,  a  general  necessary  condition  for  identifiability  of  model  (3)  is 
given  by  the  following  theorem. 

Theorem  1.  Suppose  that  the  function  h  satisfies  /0°°  h(t)dt  <  oo  and  that ,  for  some  T  >  0, 
h(t)  =  0  for  t  >  T.  If  the  model  is  identifiable  in  a  family  F,  then 

F(T)  >  0,  for  all  F  e  T. 

Definition.  A  family  T  of  absolutely  continuous  probability  distributions  on  R+  is  said  to  be 
graduated  if  for  every  two  distinct  p.d.fs  f,  f  €  T  and  for  every  constant  A  >  0.  there  is  a 
number  r  >  0  (which  may  depend  on  /,/  and  A)  such  that  either  Af(t)  >  f(t)  for  all  t  >  r,  or 
Af(t)  <  f(t)  for  all  t  >  r. 

The  following  result  generalizes  Theorem  1  in  the  case  of  graduated  families. 

Theorem  2.  Suppose  that  h  is  bounded ,  supported  on  [0,T]  for  some  T  >  0,  and  positive 
almost  everywhere  on  [0,  T].  Then  the  model  is  identifiable  in  a  graduated  family  T  if  and  only 
if  F(T)  >  0  for  all  F  €  F. 

2.2.  Tumor  Growth 

The  following  general  functional  form  is  assumed  for  the  tumor  size  (the  number  of  cells  in  a 
tumor)  S: 

S(w)  =  fo(w),  (4) 

where  w  is  the  time  from  the  moment  of  the  onset  of  cancer,  and  0  is  a  parameter  which  may 
be  scalar  or  vector,  deterministic  or  random.  It  is  assumed  that,  for  every  0,  /#  is  a  strictly 
monotonously  increasing  absolutely  continuous  function  such  that  fo(0)  =  1.  For  a  given  0, 
denote  by  go  the  inverse  function  for  /#,  and  set 

pw 

$e{w):=  /  fe{u)du. 

Jo 

Specific  laws  of  tumor  growth  of  primary  interest  are  listed  below. 

(1)  Deterministic  exponential  growth;  in  this  case,  5(tr)  =  eXw ,  where  A  >  0  is  a  constant 
growth  rate;  see  [33]  for  substantiation. 

(2)  Exponential  growth  with  A  thought  of  as  a  gamma  distributed  r.v.  [34]. 

(3)  The  Gompertz  law 

S(w)  =  eA(1-e~"“), 
with  constant  parameters  A,  B  >  0. 

2.3.  Screening  Schedules 

The  sequence  of  moments  of  time  assigned  for  medical  exams  for  a  specific  cancer  and  counted 
from  the  birth  of  a  patient  will  be  called  a  screening  schedule .  Let  T  be  the  set  of  all  possible 
screening  schedules  r  =  {r\  <  r2  <  <  rn}.  The  set  T  may  be  subject  to  (some  of)  the 

following  restrictions: 

(a)  n  <  no,  where  no  is  an  upper  bound  for  the  number  of  exams; 

(b)  r\  >  m  and  rn  <  M,  where  m  and  M  are  the  earliest  and  the  latest  times  for  the  first 
and  the  last  exams,  respectively; 

(c)  t?;+i  —  Ti  >  h  >  0  for  alH  =  1,2, . . .  ,n  -  1  (this  condition  suggests  a  lower  bound  h  for 
the  minimal  duration  between  any  two  successive  exams). 

Other  restrictions  on  the  moments  of  exams  can  also  be  accommodated.  In  the  language  of 
control  theory,  the  set  T  is  referred  to  as  the  set  of  admissible  schedules. 
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2.4.  Tumor  Detection 

We  distinguish  between  spontaneous  and  screening  based  tumor  detections.  The  first  occurs 
in  the  absence  of  or  concurrently  with  screening  and  is  thought  of  as  a  continuous  process.  In 
contrast  to  this,  screening  based  detection  is  an  instantaneous  event  that  may  occur  only  at  the 
moments  of  the  prescribed  medical  exams  and  is  therefore  a  discrete  process.  When  both  types 
of  detection  are  present,  they  can  be  viewed  as  competing  risks. 

Numerous  attempts  have  been  made  to  relate  the  probability  of  detecting  a  tumor  to  its  size  [33- 
37].  Following  [37],  we  assume  that  the  rate  ro  of  spontaneous  tumor  detection  is  proportional 
to  the  current  tumor  size 

r0  =  a0S,  (5) 

where  ao  is  a  positive  constant. 

Let  r.v.s  Wo  and  W\  denote  the  times  of  spontaneous  and  screening  based  detections,  counted 
from  the  moment  of  cancer  onset,  respectively.  Then,  for  the  moment  W  of  combined  detection, 
when  both  detection  mechanisms  are  in  place,  we  have  W  =  min  (Wo,  W\).  Denote  by 

No  =  fo  (Wo)  and  N  =  fe{W)  (6) 

the  corresponding  tumor  sizes  at  spontaneous  and  combined  detection. 

Keeping  in  mind  relation  (2)  between  the  survivor  function  of  an  absolutely  continuous  non¬ 
negative  r.v.  and  its  hazard  rate,  we  derive  from  (5)  that,  in  the  case  of  nonrandom  parameter  0, 

Fw0(w)  —  r°^du  =  e~a°  Jo  foWdu  __  e-a0<Mw\  (7) 


Therefore, 

FNo(n)  =  FWo  ( 9e(n ))  = 

and  hence, 

poo  poo  poo 

ENq  =  1+  FNo{n)dn=  1+  e-<*o*»(go(n))  dn  =  l+  /  e_ot0<Mu)  f'e{u)  du.  (8) 

J 1  J 1  Jo 

If  6  is  a  r.v.,  then  an  additional  integration  in  (8)  with  respect  to  the  distribution  of  0  is  required. 
In  particular,  for  nonrandom  exponential  tumor  growth  with  rate  A,  we  have 

FWo(w)  =  e-(“o/A)(eA"'-1)^  w  >  0> 

Fff0(n)  =  e-<“o/A)(n-l)i  n  >  1? 

and 

EN0  =  1  +  — .  (11) 

a0 

Equation  (10)  suggests  that  in  this  case  the  r.v.  No  has  a  translated  exponential  distribution  with 
parameter  ao/X.  If  A  is  a  r.v.  which  is  gamma  distributed  with  parameters  p,  v ,  then  it  follows 
from  (11)  that 

EN0  =  l  +  — . 

a0v 

We  now  specify  the  distribution  of  the  r.v.  W\.  Recall  that  W\  is  the  time  of  screening  based 
detection  (in  the  absence  of  spontaneous  detection)  counted  from  the  moment  of  appearance 
of  the  first  malignant  clonogenic  cell.  Indeed,  the  distribution  of  W\  depends  on  the  selected 
screening  schedule  r  =  {t\  <  r2  <  •  •  •  <  rn}.  For  the  sake  of  convenience,  set  ro  :=  0  and 
rn+i  oo.  It  suffices  to  define,  for  every  t  >  0,  the  conditional  distribution  of  W\  given  that 
T  =  t. 


(9) 

(10) 
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Let  77  <  t  <  77+i >  0  <  i  <  For  4)  <  z  <  -  1  and  i  +  1  <  fc  <  n,  define  the  probability 

Pt.(k)  :=  Pr(VFi  —  Tk  —t\T  =  t)  of  tumor  detection  at  the  kth  screen  given  the  cancer  onset  at 
moment  t,  and  by  Pt(oo)  =  1  —  Y?k=i+i  Pt(k) '  the  corresponding  conditional  probability  that  the 
tumor  is  not  detected  by  screening. 

We  introduce  a  discrete  analogue  of  the  hazard  rate  for  the  screening  based  detection  by 

n 

Pi  =  Y  n(k)5T,-t,  (12) 

where  Sx  stands  for  the  Dirac  measure  at  .t,  and  the  sum  over  the  empty  set  of  indices  is  set,  as 
usual,  to  be  zero.  By  definition,  the  discrete  measure  pt  is  related  to  the  conditional  survivor 
function  of  W\  given  that  T  =  t  through  the  equation 

F\V,  \T=t(w)  =  e_  ,0"  d'll(v\  W  >  0,  (13) 

compare  with  (2).  It  follows  from  (12)  and  (13)  that 


X!  PtU)+Pt(oo) 
j=k-\- 1 


^PtU)  +Pt(oo) 

j=k 


e-r,(k) 


or,  equivalently,  that 


i-  ptU)  = 

j=i+l 


k—  1 

1  -  pt 
j=i+ 1 


- r,(k ) 


For  k  =  i  -P  1,  we  find  from  (14)  that 

l-p,(i  +  l)  =e-r'<i+1). 

More  generally,  iterating  this  argument  we  obtain  that 


j)t(k)  =  e  r,L3 


1  _e-n(A) 


i  +  1  <  k  <  n. 


(14) 


(15) 


Observe  that  this  holds  true  for  all  k  =  1, ... ,  ?z,  if  we  set  Pt.{k)  =  rt(k)  =  0  for  1  <  k  <  i. 

Similar  to  (5),  we  are  assuming  that  the  discrete  rate  of  screening  based  detection  is  propor¬ 
tional  to  the  current  tumor  size 


Ti  (A;)  =  aS  (rfc  —t) ,  i  4- 1  <  k  <  n ,  (16) 

with  some  constant  a  >  0.  Combining  (13),  (12),  and  (16)  with  (4),  we  find  that,  given  any  t 
such  that  Tj  <  t  <  77+1,  0  <  i  <  n  —  1, 

F\y^j>=t{w)  =  e“a^='+1  fo(Tk-t)^  where  Tj  —  t  <w  <  Tj+ 1  —  t,  i  +  1  <  j  <  n.  (17) 

Consider  the  case  of  one  exam  occurring  at  a  moment  r  with  the  detection  probability  p  = 
/;(£,  r)  and  the  discrete  detection  rate  r  =  r(£,  r).  Then  by  (15),  1  —  p  =  e~r .  If  the  probability  p 
is  small,  then  the  rate  r  is  approximately  equal  to  p .  In  particular,  under  assumption  (16), 
the  probability  of  tumor  detection  is  approximately  proportional  to  the  current  tumor  size  p  ^ 
aS(r  —  t).  Klein  and  Bartoszynski  [34]  proceeded  in  their  study  of  breast  cancer  from  a  more 
general  assumption  that  the  probability  of  tumor  detection  is  proportional  to  some  power  of  the 
tumor  size.  Their  estimate  of  this  power  leads,  however,  to  a  value  which  is  very  close  to  1. 
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3*  FORMULA  FOR  THE  SCREENING 
EFFICIENCY  FUNCTIONAL 


We  proceed  from  the  following  two  biologically  natural  assumptions. 

1.  The  r.v.s  Wq  and  T  are  independent. 

2.  For  every  t  >  0,  the  r.v.s  W\  and  Wo  are  conditionally  independent  given  that  T  =  t. 

The  first  assumption  claims  that  the  moment  of  spontaneous  tumor  detection  measured  from 
the  appearance  of  the  first  malignant  clonogenic  cell  is  independent  of  the  prior  duration  of 
tumor  latency.  The  second  assumption  reflects  a  technological  (or  instrumental)  nature  of  both 
detection  processes.  It  states  that,  given  the  moment  of  cancer  onset,  the  two  times  W0  and  Wi, 
at  which  competing  events  of  the  spontaneous  and  screening  based  tumor  detection  may  occur, 
are  independent.  This  statement  immediately  follows  from  the  assumption  that  both  detection 
processes  are  completely  determined  by  the  current  tumor  size  as  a  deterministic  function  of 
time. 

For  an  admissible  screening  schedule  r  G  T,  we  define  the  efficiency  functional  as  the  Kan¬ 
torovich  distance  cLk(No ,  A;  r)  (see  [18,20,21])  between  the  tumor  sizes  N0  and  N  at  spontaneous 
and  combined  detection.  This  quantity  serves  as  a  clinically  natural  measure  of  the  gain  resulting 
from  screening.  It  is  well  known  [19,20]  that 


d(A,A0;r) 


\FNo(n)  -  Fjv(n) 


dn. 


(18) 


It  follows  from  (7),  inequality  W0  >  W,  and  monotonicity  of  the  function  fe  that  the  r.v.  No 
stochastically  dominates  the  r.v.  N  :  FNo  >  FN.  This  leads  to  the  following  alternative  expression 
for  the  efficiency  functional: 

/OO  pOO 

F/v0(n)  dn  -  J  Fjv(n)  dn  =  ENq  -  EN,  (19) 


where  E  stands  for  the  expectation. 

Suppose  that  parameter  6  is  nonrandom.  We  set  n  =  fo{w)  and  condition  upon  the  r.v.  T 
in  (18)  to  obtain 


poo 

d (N,  No ;  r)  =  /  \FWo(w)  -  Fw{w) \  f'e(w) dw 
Jo 

noo 

\FWo(w)  -  Fw \T=t  (' w)\  fg{w)  dw dFT{t ), 


where  Fw\T-t  ls  the  conditional  survivor  function  of  the  r.v.  W  given  that  T  =  t.  Since  W  = 
min(Wo,  Wi),  it  follows  from  our  Assumptions  1  and  2  that 


F\ 


Wo 


Fw\r=t  —  F\ 


Wo 


■  FwoFw^r^t  =  Fw0FWl\T=t- 


Therefore, 

poo  p  OO 

d(N,No;r)=  /  FWl\T=t{w)FWo{w)fe{w)dwdFT(t).  (20) 

Jo  Jo 

Observe  that  if  T  =  t,  where  n<t  <  ri+i,  0  <  i  <  n,  then  the  only  possible  values  of  the  r.v.  W\ 
are  r,:+1  -  t, . . .  ,rn+ 1  -  t.  More  specifically,  W\  =  Tj  - 1,  i  +  1  <  j  <  n,  if  the  jth  exam  detected 
a  tumor,  and  W\  =  rn+i  -  t  =  oo  if  the  tumor  was  not  detected  in  the  course  of  screening. 
Therefore,  if  t>Tn  or  r*  <  t  <  t,;+i,  0  <  i  <  n  —  1,  and  0  <  w  <  r^+i  —  t,  then  Fyy^x=t(w)  =  0- 
This  allows  us  to  rewrite  (20)  in  the  form 

d{N,N0;r)  =  £  r  t  r  FWl\T=t(w)FWo{w)fe{w)dwdFT{t). 

i= 0^Ti  j=i+\ 
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We  now  recall  the  explicit  expression  (17)  derived  above  for  the  function  FWl \T=t,  and  denote 

rOC 

Ge{x)  -=  FWo{w)f'e(w)dw,  x  >  0, 


to  obtain  finally 


n—  1 


dt,N,N0-,T)  =  Y,  f  E  I1-'"” 

0  j=?+ 1  L 

n— 1  pTi+ 1  n 

=  E  E  e~aE*=)+i 

i=0  •/ri  ?=<+! 


-QEi=,'+i  fo(rk-t ) 


[G0  (rj  -t)  -  Go  {tj+ i  -  /)]  dFr(t) 


(21) 


fo(Tk-t) 


l  _  g-a/rtfr,-*) 


Gfo  -*)  dFr(t). 


In  the  case  when  parameter  0  is  random,  the  right-hand  side  of  (21)  should  be  integrated  addi¬ 
tionally  with  respect  to  the  distribution  of  9. 

If,  in  particular,  fg{w)  —  eXw  with  a  constant  rate  A,  then  invoking  (9)  we  find  easily  that 

Ge(x)  =  ±e-(° o/A)(e--x))  x  >  0. 

ao 

In  this  case,  the  efficiency  functional  (21)  takes  on  the  form 


\  2. _ }  fTi+ 1  71  i 

Ll°  i=0  Jt'  j=i+l 


d(N,N0;r) 
ri-e- 


r(fto/A)(eM^-0 -i)dFT{t)  (22) 


Observe  also  that  (19)  implies 


EN  =  EN0  —  d  (AT,  No;  r) . 

This  allows  for  an  explicit  calculation  of  the  expected  tumor  size  at  combined  detection  on  the 
basis  of  formulas  (8)  and  (21). 

The  problem 

d  (N,  No;t)  max,  r^T,  (23) 

can  be  solved  by  exhaustive  search  with  some  simplification  arising  from  the  special  form  of 
the  dependence  of  the  functional  (21)  on  r.  A  question  of  practical  importance  is  what  are  the 
values  of  the  number  n  of  exams  for  which  the  problem  (23)  is  computationally  feasible.  We  will 
conclude  this  paper,  which  deals  primarily  with  methodological  and  mathematical  aspects  of  the 
problem  of  optimization  of  cancer  surveillance,  with  some  sample  calculations  with  prescribed 
values  of  model  parameters. 

4.  NUMERICAL  EXPERIMENTS 

It  was  assumed  that  the  time  T  to  tumor  onset  is  gamma  distributed  with  the  mean  ^  ==  50 
years  and  the  standard  deviation  a  ~  20  years.  The  graph  of  the  c.cl.f.  Ft  is  shown  in  Figure  1. 
The  law  of  tumor  growth  was  taken  to  be  deterministic  exponential  with  the  rate  A  =  1.6  years-1, 
which  corresponds  to  the  tumor  size  doubling  time  of  approximately  5.2  months.  The  rate  of 
spontaneous  tumor  detection  was  assumed  to  be  ao  =  0.03.  This  value  has  no  relevance  to  any 
actual  data  and  serves  only  a  purpose  of  illustration.  The  graph  of  the  survivor  function  F\y0 
given  by  equation  (9)  is  presented  in  Figure  2.  The  effect  of  one  exam  occurring  after  tumor 
onset  with  the  screening  based  tumor  detection  rate  a  =  0.1  is  shown  in  Figure  3  featuring  the 
survivor  function  of  the  time  W  to  combined  tumor  detection. 
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/j=50,  a -20 


Time,  years 


Figure  1.  The  cumulative  distribution  function  for  the  time  to  tumor  onset. 


0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.C 

Onset  Time,  years 

Figure  2.  The  survivor  function  for  the  time  to  spontaneous  detection. 

Survivor  function  of  the  time  to  combined  detection 


'O.O  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4. 

Onset  Time,  years 

Figure  3.  The  survivor  function  for  the  time  to  combined  detection  (a  =  0.1) 
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a 

n=10 


Figure  4.  Optimal  screening  efficiency  as  a  function  of  the  parameter  a  given  a  fixed 
number  (n  =  10)  of  examinations  and  a  =  20  years. 


n=10 


Figure  5.  Optimal  screening  efficiency  as  a  function  of  the  standard  deviation,  a ,  of 
the  onset  time  (n  =  10,  a  =  0.1). 


Table  1.  Optimal  screening  schedules. 


n  =  20 

a 

a 

h 

II 

< 

A2 

a3 

a4 

a5 

Ae 

A7 

A8 

20 

0.1 

27.75 

2.25 

2.00 

2.00 

2.00 

2.00 

2.00 

2.00 

2.00 

2.25 

An 

A12 

A13 

A14 

A15 

Ai6 

A17 

Ais 

A20 

2.25 

2.25 

2.25 

2.25 

2.25 

2.25 

2.25 

2.25 

2.25 

2.25 

n  =  10 

a 

a 

C 

II 

<r 

a2 

a3 

a4 

As 

A7 

As 

a9 

A10 

20 

0.1 

35.00 

2.50 

2.50 

2.50 

2.25 

2.50 

2.50 

2.50 

2.50 

2.50 

20 

0.3 

35.00 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 

20 

0.5 

34.50 

2.75 

2.75 

2.50 

2.25 

2.50 

2.50 

2.50 

2.75 

2.75 

10 

0.1 

40.75 

2.00 

2.00 

2.00 

2.00 

2.00 

2.00 

2.00 

2.00 

2.25 

30 

0.1 

27.25 

2.50 

2.75 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 

2.50 
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Figure  6.  Profiles  of  the  efficiency  functional  (n  =  20,  c  =  20  years,  a  —  0.1). 


The  search  for  optimal  screening  schedules  and  optimal  screening  efficiencies  was  conducted  for 
a  fixed  number  n  of  screens  with  no  restriction  on  the  moments  of  exams  and  for  various  values 
of  a.  The  method  of  optimization  was  the  exhaustive  search  with  the  step  0.25  years.  Parameter 
values  n  =  50  years,  A  =  1.6  years-1,  and  cto  =  0.03  were  fixed  throughout  the  calculations.  For 
n  =  10,  plots  of  the  rescaled  optimal  screening  efficiency  d  with  a  =  20  years  versus  a  and,  for 
a  =  0.1,  versus  <7  are  shown  in  Figures  4  and  5,  respectively.  As  it  could  be  expected,  d  increases 
with  increasing  a  and  decreases  with  increasing  a . 

The  results  of  our  search  for  optimal  screening  schedules  with  n  =  10, 20  and  with  several 
values  of  <7  and  a  are  given  in  Table  1.  For  the  reader’s  convenience,  screening  schedules  are 
represented  by  the  intervals  :=  n  -  Ti_i,  *  =  1, .  •  • ,  n,  between  two  successive  exams.  For  all 
cases  explored,  optimal  screening  schedules  are  uniform  or  very  close  to  such. 

As  a  test  for  optimality  of  a  screening  schedule,  profiles  of  the  efficiency  functional  (22),  with 
n  -  1  moments  of  exams  fixed  at  the  optimal  values  and  the  remaining  one  varying  between  the 
two  fixed  neighboring  moments  of  exams,  were  computed.  For  n  =  20  and  a  =  0.1,  these  profiles 
are  given  in  Figure  6.  For  the  moment  n,  a  clear  cut  maximum  was  observed  (see  Figure  6d), 
while  for  r2o  the  maximum  is  more  flat  (see  Figure  6c).  All  intermediate  moments  of  exams 
t2,  .  •  • ,  rig  demonstrated  a  well-pronounced  parabolic  maximum  (see  Figures  6a  and  6b). 

This  study  shows  that  mathematical  and  computational  problems  of  optimal  cancer  surveillance 
are  tractable  within  the  framework  of  the  proposed  model. 
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Abstract 

This  paper  is  concerned  with  modern  approaches  to  mechanistic  modeling  of  the  process  of  cancer 
detection.  Measurements  of  tumor  size  at  diagnosis  represent  a  valuable  source  of  information  to  enrich 
statistical  inference  on  the  processes  underlying  tumor  latency.  One  possible  way  of  utilizing  this  infor¬ 
mation  is  to  model  cancer  detection  as  a  quantal  response  variable.  In  doing  so,  one  relates  the  chance  of 
detecting  a  tumor  to  its  current  size.  We  present  various  theoretical  results  emerging  from  this  approach 
and  illustrate  their  usefulness  with  numerical  examples  and  analyses  of  epidemiological  data.  An  alternative 
approach  based  on  a  threshold  type  mechanism  of  tumor  detection  is  briefly  described.  ©  2001  Elsevier 
Science  Inc.  All  rights  reserved. 
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1.  Introduction 

Dr  Robert  Bartoszynski  passed  away  on  17  January  1998.  The  biography  of  this  exceptional 
individual  will  be  published  in  a  special  issue  of  Mathematical  and  Computer  Modelling  [1] 
dedicated  to  his  memory.  In  last  months  of  his  life  he  was  developing  new  approaches  to 
stochastic  modeling  of  quantal  response  variables  in  carcinogenesis  and  metastatic  progression 
of  tumors.  Unfortunately,  this  latest  work  remained  unfinished.  The  originality  and  depth  of 
his  thought  lead  us  to  expect  that  new  notable  results  would  have  emerged  from  his  research 
endeavor.  In  this  paper,  Robert’s  friends  and  colleagues  make  an  attempt  to  develop  some  of 
his  most  basic  ideas.  In  doing  so,  we  discuss  a  variety  of  issues  and  problems  that  arise  in  the 
quantitative  description  of  the  process  of  cancer  detection,  not  only  those  of  special  interest  to 
Robert  in  his  last  work.  We  are  convinced  that  Robert  would  have  done  this  differently  and 
most  probably  in  a  much  more  elegant  way.  However,  this  is  the  best  these  authors  could  do 
to  pay  a  tribute  to  one  of  the  brightest  scientists  in  the  field  of  biomathematics  and  biosta¬ 
tistics. 

Bartoszynski  and  co-workers  [2-6]  have  developed  a  new  avenue  in  stochastic  modeling  of 
cancer  detection.  The  basic  idea  behind  their  approach  is  to  relate  the  chance  of  detecting  a  tumor 
to  its  current  size.  This  idea  was  also  explored  by  Kimmel  and  Flehinger  [7]  in  the  context  of  the 
primary  tumor  size  -  metastasis  relationship  in  solid  cancers  and  by  Hanin  et  al.  [8]  in  an  attempt 
to  develop  new  approaches  to  optimal  scheduling  of  cancer  surveillance. 

In  the  present  communication,  we  address  a  wide  spectrum  of  problems  associated  with  sto¬ 
chastic  modeling  of  cancer  detection.  In  Sections  2  and  3,  we  give  an  introduction  to  the  modeling 
techniques  based  on  quantal  response  models.  Marginal  distributions  of  tumor  size  and  age  at 
detection  as  well  as  associated  estimation  problems  are  discussed  in  Sections  4  and  5;  the  joint 
distribution  of  the  two  random  variables  and  their  randomized  counterpart  is  given  in  Section  8. 
Generally  speaking,  explicit  formulas  for  the  marginal  distributions  of  tumor  size  and  age  of  an 
individual  at  detection  are  not  sufficient  to  utilize  completely  the  information  contained  in  the 
corresponding  sample  observations  for  estimation  of  parameters  describing  the  natural  history  of 
the  disease;  one  needs  to  know  their  joint  distribution  in  order  to  develop  pertinent  methods  for 
the  maximum  likelihood  statistical  inference. 

In  Sections  6  and  7,  we  explore  some  identifiability  and  stability  properties  of  the  proposed 
model  of  tumor  detection.  For  the  sake  of  completeness,  an  alternative  model  of  a  threshold  type 
process  of  tumor  detection  is  considered  in  Section  9.  Section  10  explores  similar  ideas  in  rele¬ 
vance  to  metastatic  processes  which  were  also  one  of  the  major  subjects  for  Dr.  Bartoszynski’s 
research  in  the  latest  period  of  his  life. 

We  believe  that  the  basic  idea  behind  the  modeling  techniques  presented  in  this  paper  deserves 
further  exploration. 

2.  Quanta]  response  variables 

Let  Y(t),  t  ^  0,  be  a  stochastic  process,  and  T  be  an  absolutely  continuous  non-negative 
random  variable  (r.v.)  defined  on  the  same  probability  space  and  interpreted  as  time  of  occurrence 
of  a  certain  event. 
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Definition  1.  A  random  variable  T  is  said  to  be  quantal  with  respect  to  the  stochastic  process  Y(t) 
if  there  exists  a  non-negative  risk  function  r(y ),  y  ^  0,  such  that  for  all  t  ^  0,  At  >  0,  and  all 
admissible  y , 

Pr{T  G  [t,t  +  At)\T  >  t  and  Y(t )  =  y}  =  r(y)St  +  o(At), 

where  o(x)  is  a  function  such  that  o(x)/x  — >  0  as  x  — >  0+. 

The  concept  of  quantal  response  variable  was  introduced  and  studied  by  Puri  and  Centuria  [9];  its 
further  analysis  is  due  to  Puri  [10,11],  In  fact,  the  definition  in  [9]  postulated  existence  of  a  more 
general  risk  function  r(y,  t) ;  however,  the  present  work  is  concerned  with  age-independent  case  where 
the  conditional  hazard  function  of  the  r.v.  T  depends  on  time  only  through  the  current  value  of  the 
stochastic  process  Y(t).  Specifically,  it  is  the  important  particular  case  r(y)  =  ay  with  constant  a  >  0 
that  is  the  main  subject  matter  of  this  paper. 

The  above  concept  appeared  in  [9]  as  an  alternative  to  threshold  models  of  biological  effects.  The 
latter  modeling  techniques  are  based  on  the  assumption  that  a  response  of  the  organism  to  a  given 
(pathological)  process  occurs  as  soon  as  the  process  exceeds  some  threshold  level,  which  may  be 
random  or  deterministic  (see  Section  9  for  further  discussion).  Puri  [10]  (see  also  [9])  referred  to  a 
personal  communication  with  LeCam,  who  apparently  was  the  first  to  suggest  a  quantal  response 
model  in  the  context  of  host’s  response  to  microbial  infection. 

Whether  a  quantal  mechanism,  or  a  threshold  mechanism  is  a  more  adequate  description  of  the 
process  of  tumor  detection  is  a  question  that  may  be  impossible  to  answer  in  the  present  state  of 
biological  knowledge.  The  diversity  and  complexity  of  detection  mechanisms  remain  to  be  clearly 
understood.  In  such  situations,  it  is  perhaps  best  to  take  a  pragmatic  approach,  and  choose  that 
option  which  leads  to  more  mathematically  tractable  formulations  of  biologically  meaningful 
problems.  The  main  difference  between  threshold  and  quantal  mechanisms  is  that  in  the  first  case  the 
event  of  interest  always  occurs  at  the  so-called  ladder  point  of  the  process  Y{t),  that  is,  at  a  value 
higher  than  any  previously  attained.  In  contrast,  a  quantal  event  may  occur  at  any  value  of  the 
process  Y  ( t );  it  is  only  assumed  that  the  likelihood  of  the  event  of  interest  depends  on  the  process  Y  ( t ) . 
This  means  that  the  quantal  response  model  is  indeterministic  in  nature;  its  stochastic  character 
remains  even  if  the  process  Y ( t )  is  deterministic. 

A  more  general  quantal  response  model  arises  when  the  hazard  function  of  the  time  T  to  the  event 
of  interest  depends  on  the  past  sample  path  of  the  process  Y ( t )  or  on  functionals  of  the  sample  path. 
Handling  quantal  responses  in  this  case  involves  conditioning  on  a  sample  path  of  the  process  Y (t), 
deriving  the  desired  formulas,  and  then  ‘unconditioning’  the  resultant  expressions.  The  main  diffi¬ 
culty  lies,  naturally,  in  the  last  step.  In  the  opposite  extreme  case  where  Y ( t )  is  a  deterministic  strictly 
increasing  function  of  time,  various  characteristics  of  r.v.  T  can  be  obtained  through  its  hazard 
function  hT(t)  =  r(Y(t)). 


3.  Tumor  detection  as  a  quantal  response  event 

When  modeling  time  of  tumor  detection  as  a  quantal  response  variable  in  accordance  with 
Definition  1,  the  first  problem  is  to  determine  the  form  of  the  risk  function  r(y).  The  most  natural 
approach  is  to  relate  the  chance  of  detecting  a  tumor  to  its  current  size  [2-8].  Although  the  rule 
‘the  larger  the  tumor,  the  more  likely  it  will  become  detected’  appears  unquestionably  valid,  the 
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real  question  is  to  what  characteristic  of  tumor  size  is  the  ‘risk’  of  detection  proportional.  For 
tumors  easily  accessible  to  palpation  (e.g.  skin  cancer),  the  answer  is  obvious:  it  is  the  volume  of  a 
tumor  that  plays  the  role  of  Y(t).  The  same  seems  to  be  true  for  breast  cancer,  although  the 
reasons  for  this  are  much  less  clear.  Indeed,  the  major  technique  of  detecting  breast  tumors  is 
mammography,  where  the  tumor  is  observed  as  a  projection  on  the  plane.  A  priori,  therefore,  one 
could  postulate  that  the  process  Y(t)  in  breast  cancer  detection  should  be  the  surface  of  the 
projection,  that  is  a  quantity  of  the  order  of  [K(/)]2/3,  where  V(t)  is  the  volume  of  the  tumor  at 
time  t.  Klein  and  Bartoszyhski  [6]  tested  the  relationship  Y(t)  =  [F(t)f,  and  estimated  e,  with 
e  =  1  providing  the  best  fit  to  breast  cancer  data.  For  other  tumor  sites,  detection  may  be  related 
to  appearance  and  intensity  of  some  symptoms,  which  may  depend  on  cumulative  effects  of  the 
presence  of  a  tumor,  so  that  7(t)  =  jjj  V(t  —  T)^(r)dr  for  some  suitable  functions  or  measures  k. 
Another  example  of  this  kind  is  provided  by  molecular  markers  of  tumor  growth  that  have  proven 
to  be  quite  useful  in  clinical  practice. 

Suppose  the  growth  of  a  tumor  begins  at  time  t  —  0.  Let  X ( t )  be  the  number  of  tumor  cells  at 
time  t  ^  0  and  Y(0)  =  n0.  In  the  simplest  case  where  Y(t)  —  ocX(t),  a  >  0,  the  following  line  of 
reasoning  illustrates  distinct  advantages  provided  by  the  assumption  that  tumor  detection  is  a 
quantal  response  event.  Suppose  that  the  initial  number  of  tumor  cells  n0  is  non-random  and  the 
growth  of  a  tumor  obeys  the  postulates  of  the  homogeneous  pure  birth  process,  also  known  as  the 
Yule  process  [12],  with  a  constant  birth  rate  X.  Then  for  every  fixed  t  the  random  variable  X(t) 
follows  a  negative  binomial  distribution,  that  is 

Pr{X{t)  =  n}=  (  -  Q~y~no,  n  >  «0, 


Pr{X{t)  =  n}  =  0,  n  <  nQ. 

It  follows  that  the  expectation  and  variance  of  tumor  size  at  time  t  are  equal  to 

£{X(r)}  =  X(0)e;-'  and  Var{Y(r)}  =  Y(0)eA,(e;'' -  1),  (1) 

respectively. 

Let 

Z(t)  :=  X{t)t~h. 

Then,  for  t,  t  ^  0  we  have 

E{Z(t  +  t)|  Z(t)}  =  E{X(t  +  t)e-;(m)|Z(/)} 

=  e“;('+T)£{Ar(r  +  t)|Y(t)} 

=  e^(,+r)Y(0e;-T 

=  Z(t). 

This  argument  shows  that  Z(t)  is  a  martingale.  Consequently,  X(t)e~Xl  converges  almost  surely  to 
a  random  variable,  say  £,  as  t  — ►  oo  [13],  Under  mild  conditions,  a  similar  asymptotic  result  holds 
for  a  more  general  model  of  the  Bellman-Harris  branching  stochastic  process  [14],  It  follows  from 
formulas  (1)  that  £’{£}  =  Var{<^}  =  «0.  According  to  the  widely  accepted  clonogenic  concept  of 
tumor  development,  the  process  of  tumor  growth  begins  with  a  single  malignant  cell,  i.e.  «0  =  1,  in 
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which  case  E{Q  =  Var{£}  =  1,  and  the  random  variable  t  has  the  unit  exponential  distribution 
[15].  Therefore,  the  contribution  of  initial  stochastic  fluctuations  to  the  process  X{t)  is  expected  to 
be  relatively  small  if  the  time  elapsed  from  the  tumor  onset  is  sufficiently  large. 

From  the  practical  point  of  view,  it  would  make  sense  to  disregard  the  fluctuations,  and  treat 
X(t)  as  a  continuous  increasing  deterministic  function  of  time.  Consequently,  the  volume  of  a 
tumor  at  time  t  after  the  event  that  initiates  the  tumor  growth  (generates  the  first  malignant  cell) 
will  be  described  as  ce'^,  where  y  =  l/X  and  c  =  jc(0)  is  the  volume  of  a  single  tumor  cell 
(c  ~  10'9  cm3,  see  [6]). 

There  are  at  least  two  ways  of  generalizing  this  simple  model.  First,  one  may  treat  the  pa¬ 
rameter  y  (or  X)  as  a  random  variable,  thereby  yielding  various  randomized  counterparts  of  the 
basic  model  of  exponential  growth.  The  second  possibility  is  to  go  beyond  the  pure  birth  process 
and  proceed  from  more  complex  laws  of  tumor  growth,  like  the  Gompertz  or  logistic  functions,  to 
generate  a  richer  family  of  models.  In  doing  so,  it  is  a  good  idea  to  construct  an  hierarchical 
(nested)  family  of  models  so  that  a  particular  (minimal)  sub-family  could  be  selected  to  provide  a 
sufficiently  accurate  description  of  real  data.  Whether  or  not  the  above  possibilities  are  feasible 
depends  heavily  on  the  type  of  data  to  be  analyzed. 


4.  Tumor  size  at  detection  and  its  distribution 

In  this  paper  we  confine  ourselves  to  the  process  of  spontaneous  (without  screening)  tumor 
detection.  This  term  refers  to  self-detection  and  incidental  diagnoses  resulting  from  individual 
medical  exams  that  do  not  follow  any  fixed  surveillance  schedule.  However,  it  is  possible  to  extend 
the  model  of  cancer  detection  to  include  discrete  surveillance  (screening)  strategies  or  a  combi¬ 
nation  of  spontaneous  and  screening-based  detection  mechanisms  [8].  It  should  be  kept  in  mind 
that  in  most  cases  the  event  of  spontaneous  tumor  detection  is  a  process  that  takes  a  certain 
amount  of  time  rather  than  an  instantaneous  event.  This  process  normally  consists  of  various 
medical  exams  that  may  or  may  not  be  triggered  symptomatically.  In  what  follows,  spontaneous 
detection  is  thought  of  as  occurring  in  the  course  of  the  exam  that  confirms  inequivocally  the 
tentative  diagnosis,  and  the  time  of  this  exam  is  termed  the  time  of  spontaneous  tumor  detection. 

Suppose  that  the  number  of  tumor  cells,  X(t),  present  in  the  tumor  at  time  t  (measured  from  the 
time  of  tumor  onset,  i.e.  the  event  of  malignant  transformation  of  a  premalignant  initiated  cell)  is 
described  by  a  deterministic  growth  function  of  time,  that  is, 

X(t)=fe(t),  (2) 

where  6  is  a  parameter  which  may  be  scalar  or  vector,  deterministic  or  random.  For  simplicity,  we 
assume  that  the  parameter  6  is  non-random.  It  is  also  assumed  that,  for  every  0 ,  fe(t)  is  an  ab¬ 
solutely  continuous  strictly  monotonically  increasing  function  such  that  fg( 0)  =  1.  For  a  given  9, 
denote  by  i J/e(t)  the  inverse  function  for  fg(t),  and  set 

®8{t):=  [  fe{u)du. 

Jo 

The  next  step  is  to  assume  that  the  rate  r  of  spontaneous  tumor  detection  is  proportional  to  the 
current  tumor  size 
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r(t)  =  aX(t)  =  ctfo{t ), 


(3) 


where  a  is  a  positive  constant,  compared  with  Definition  1 .  Let  W  be  the  time  of  spontaneous 
tumor  detection  measured  from  the  time  of  tumor  onset,  and  let  S  be  the  tumor  size  at  detection. 
Then  r is  the  hazard  function  of  r.v.  W .  Therefore,  the  survival  function,  Gw(w)  =  Pr{W  >  w},  for 
the  random  variable  W  is  given  by 


Gir(w)  =  exp 


pw 

-  /  r(u ) 
Jo 


d  u 


exp  <  —  a 


r  Mu)< 

Jo 


d u  \  = 


(4) 


In  like  manner,  we  derive  the  tumor  size  tail  function,  Fs(s)  =  Pr{S  >  5},  of  the  random  variable 

S=f0{W) 

Fs{s)  =  Gw[Us)}  =  (5) 


In  the  derivation  of  formula  (5),  it  is  assumed  that  f(t)  — >  00  as  t  — »•  00.  If  f{t)  has  a  finite  limit, 
say  d,  then  formula  (5)  is  true  for  0  <  d  and  Fs(s )  =  0  for  s  ^  d. 

The  expected  tumor  size  at  detection  is  equal  to 

/OO  p  00 

Fs(s)ds=l+ c-^Mf'n(u)du.  (6) 


In  the  special  case  of  deterministic  exponential  growth  with  rate  A,  it  follows  from  formula  (5)  that 
the  random  variable  S  has  a  translated  exponential  distribution,  that  is 

Fs(s)  =  expj  - 1($-  1)},  (7) 


and  therefore, 

E{S}  =  1  +-. 

a 

Consider  a  new  random  variable  V  representing  tumor  volume  at  detection.  To  find  the  distri¬ 
bution  of  V  we  specify  the  law  of  tumor  growth  in  volume  units  by  the  function  v(t )  =  cch,  where 
c  is  the  volume  of  a  single  cell.  Then  it  follows  from  (7)  that 

/V{F  >  t>}  =  Fk(^)  =  exp  |  —  c)|,  v  >  c.  (8) 

It  is  important  to  note  that  the  distribution  of  tumor  size  (volume)  at  detection  does  not  depend 
on  the  time  of  tumor  onset. 

The  simplest  way  to  generalize  the  above  model  is  through  allowing  for  random  parameter  0. 
The  random  nature  of  the  parameter  6  can  be  interpreted  as  inter-individual  variability  of  the  law 
of  tumor  growth.  Let  h(6),  6^0,  be  the  prior  distribution  density  of  this  parameter.  From 
formula  (5)  we  immediately  obtain 

poo 

Fs{s)=  /  e-a4>o(Ms))^(0)  d0.  (9) 

Jo 

It  is  interesting  to  mention  that  the  law  of  tumor  growth  can  be  recovered  from  the  distribution  of 
tumor  size  at  detection.  To  see  this,  denote  by  ps(s)  to  be  the  probability  density  function  (p.d.f.) 


R  Bartoszyhski  et  al  I  Mathematical  Biosciences  171  (2001)  113-142 


119 


of  tumor  size  at  detection.  Differentiating  formula  (9)  with  respect  to  s,  solving  for  the  derivative 
of  \l/e,  and  then  integrating  the  resultant  expression,  we  have 


Ps(u )  d  u 
Fs{u )  u  ' 


(10) 


If  one  is  interested  in  tumor  volume  rather  than  the  number  of  tumor  cells  at  detection,  then 
formula  (10)  assumes  the  form 


pr(u )  d u 
Fv{u)  u  ’ 


where  pv{v)  =  c~lps(v/c)  is  the  p.d.f.  of  the  tumor  volume  at  detection  and  Fv(v)  =  Fs{v/c )  is  the 
corresponding  tail  function. 

If  the  growth  of  tumor  volume  is  exponential  with  parameter  X  and  y  =  1  jX  follows  a  gamma 
distribution  with  shape  parameter  a  and  scale  parameter  b  then  by  compounding  (8)  we  find  that 
the  p.d.f.  pv  of  the  tumor  volume  at  detection  follows  a  translated  version  of  the  generalized 
Pareto  distribution 


Pv(v) 


aft 

Cl  -  j~l 


P 

ct  -}-  1 


(v-c) 


-(«+») 


(11) 


where  /?  =  a  (a  +1)/  {be). 

Suppose  we  have  a  sample  v\,...,vn  of  tumor  volumes  at  diagnosis.  The  maximum  likelihood 
estimates  of  a  and  b  can  be  obtained  by  maximizing  the  log-likelihood 


/  =  nloga  +  nlogfi  -  {a  +  1)  J^log[l  +  j B(Vj-c)}  (12) 

/=  i 

expressed  in  terms  of  parameters  a  and  /?  :=  /?/ {a  +  1)  =  a/ {be).  The  equations  for  computing  the 
maximum  likelihood  estimates  are  as  follows: 


1 

n 


1 

1  +  P(vj  ~  c) 


i°§[i + kvJ  -  c)\ 

1  i—i 


=  i, 


a  = 


1 

n 


y.  i°g(i + Puj) 

j= i 


Observe  that  these  equations  always  have  a  solution  ft  =  0,  a  —  oo  which  corresponds,  as  it 
follows  from  (11),  to  the  translated  exponential  distribution.  Examples  show  that  other  solutions 
may  not  exist  or  be  multiple,  see  [16,17]  for  a  more  detailed  discussion.  In  the  case  of  multiple 
roots,  the  usual  regularity  conditions  are  no  longer  sufficient  to  guarantee  consistency  of  the 
maximum  likelihood  estimator,  even  when  it  exists  for  all  n  [18].  A  cure  for  this  difficulty  is  to 
construct  a  ^/inconsistent  estimator  and  then  apply  the  first  step  of  the  Newton-Raphson  iterative 
procedure.  It  can  be  shown  that,  under  the  commonly  invoked  regularity  conditions,  the  resulting 
estimator  sequence  is  consistent,  asymptotically  normal,  and  efficient  [18].  The  simplest  way  to 
find  a  ^-consistent  estimator  is  by  constructing  the  moment  estimator  (based  on  the  first  two 
moments)  or  applying  the  accumulation  method  [19]  associated  with  a  certain  prescribed  partition 
of  the  data  range.  It  should  be  kept  in  mind,  however,  that  the  method  of  moments  for  the 
distribution  (1 1)  is  feasible  only  for  a  >  2.  Direct  maximization  of  the  log-likelihood  function  (12) 
represents  an  alternative  way  of  finding  the  maximum  likelihood  estimators  of  the  parameters 
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incorporated  into  the  generalized  Pareto  distribution  [20],  Since  c  is  at  least  six  orders  of  mag¬ 
nitude  less  than  the  smallest  of  the  vs,  one  may  disregard  it  in  formula  (12),  if  desired.  A  detailed 
study  of  maximum  likelihood  estimation  for  the  distribution  (1 1)  in  comparison  to  the  method  of 
moments  is  given  in  [21]. 

The  way  of  estimation  of  the  parameters  a  and  b  described  above  implies  that  the  coefficient  a  is 
known.  Alternatively,  one  can  apply  the  same  randomization  procedure  not  to  the  parameter 
y  =  1/2  but  to  the  product  ay,  thinking  of  the  latter  as  a  gamma-distributed  random  variable  with 
shape  parameter  a\  and  scale  parameter  b\.  The  resultant  distribution  has  the  following  density: 


Pv(v) 


Cl\  +  1 


where  /?,  =  (a+  \)/(b\c). 

To  see  whether  the  above  randomized  version  of  the  distribution  (8)  really  makes  a  difference 
when  applied  to  epidemiological  data,  we  analyzed  measurements  of  tumor  size  at  detection  in 
1120  patients  with  lung  cancer  (all  clinical  stages)  identified  through  the  Utah  Cancer  Registry. 
These  measurements  are  provided  by  pathological  records.  The  method  of  maximum  likelihood 
with  direct  maximization  of  the  log-likelihood  was  used  to  estimate  the  numerical  parameters 
incorporated  into  models  (8)  and  (13). 

The  two  parametric  estimates  of  the  p.d.f.  of  log  V  are  compared  with  the  corresponding  his¬ 
togram  in  Fig.  1 .  The  results  of  this  analysis  clearly  indicate  that  the  exponential  distribution  (8)  is 
entirely  inconsistent  with  the  data  under  study.  On  the  other  hand,  its  randomized  counterpart 
(13)  provides  a  good  fit  to  the  data. 

Another  way  of  modifying  the  basic  model  is  through  a  different  choice  of  tumor  growth  ki¬ 
netics.  In  particular,  the  Gompertz  law  of  tumor  growth  is  specified  by  the  formula 


A,s2  (0 


with  constant  parameters  ^1,^2  >  0-  The  corresponding  p.d.f.  of  tumor  size  at  detection  is  given 
by 


Ps(s)  =  ■ 


■■[log  &\  log(«5j  —logs)] 


^lO-e^dw 


82(Si  -  log 5) 


for  1  ^  5  <  es 


We  evaluated  the  effect  of  tumor  growth  kinetics  on  the  shape  of  the  p.d.f.  of  tumor  size  by  sample 
computations.  The  p.d.f.  ps  for  the  Gompertz  growth  depends  on  two  parameters  a/d2  and  <5i .  We 
selected  the  following  numerical  values  of  these  parameters:  ol/82  =  1.4  x  10“'°  and  =  25.  To 
see  how  well  can  the  p.d.f.  ps  be  approximated  by  the  exponential  density  (derived  from  formula 
(7))  of  the  form 


P*s(s)  =  Jexp{  ^}’  s>1. 


we  used  the  transformation  Z  =  log  S  and  then  minimized  the  Hellinger  distance  [22]  between  the 
two  densities  with  respect  to  the  parameter  a/2.  The  distance  attained  its  minimum  for 
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Fig.  1.  The  probability  density  function  for  log  V  estimated  from  data  on  the  volume  V  of  lung  cancer  at  diagnosis. 
Dashed  line  represents  the  parametric  estimate  based  on  formula  (8),  solid  line  represents  the  estimate  based  on  the 
more  general  model  (13),  step-wise  curve  is  the  histogram  constructed  from  the  sample  values  of  logK,  where  V  is 
measured  in  mm3. 


ol/X  =  7.67  x  10“",  which  value  was  used  when  computing  the  p.d.f.  of  Z  in  the  case  of  expo¬ 
nential  growth.  Fig.  2  shows  quite  dissimilar  shapes  of  the  distribution  of  tumor  size  at  detection 
for  the  two  different  kinetic  curves  of  tumor  growth. 


Fig.  2.  The  p.d.f.  of  log  S  for  two  different  types  of  tumor  growth.  Solid  line:  exponential  growth,  dashed  line: 
Gompertzian  growth. 
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5.  Age  at  detection  and  its  distribution 

Given  the  tumor  growth  begins  at  time  t  =  0,  the  survival  function  for  the  time  to  detection  is 
specified  by  formula  (4).  In  particular,  for  non-random  exponential  tumor  growth  with  rate  A,  we 
have 


Gw(w)  =  e  ® t>w  w  ^  0.  (14) 

A  randomized  counterpart  of  G^(vv)  may  be  written  as 

poo 

Gw(w)=  g-a^-i  )h{y)dy,  w^O,  (15) 

Jo 

where  h  represents  the  p.d.f.  of  the  parameter  y  =  1/A.  We  used  a  gamma  distribution  for  y  in  our 
analysis  of  the  data  on  lung  cancer  based  on  the  distribution  of  tumor  size  at  detection  (Section  4). 
This  analysis  resulted  in  an  estimate  of  the  shape  parameter  of  the  function  h  that  was  very  close 

to  1,  thereby  suggesting  an  exponential  distribution,  h{y)  =  bexp(-by),  y  >  0,  to  be  used  in  the 

compounding  procedure.  Setting  the  coefficient  a  equal  to  2.3  x  10-10  we  obtained  the  estimate 
b  =  6.9.  The  resultant  distribution  density  pw  is  plotted  in  Fig.  3.  Shown  in  this  figure  are  two 
other  densities  pw  computed  for  two  different  values  of  a.  It  is  clear  that  even  a  two  orders  of 
magnitude  difference  in  the  value  of  a  (given  the  same  value  of  b )  does  not  significantly  change  the 
shape  of  the  function  pw. 

In  describing  the  natural  history  of  cancer,  the  process  of  tumor  development  can  be  broken 
down  into  three  stages.  These  stages  are: 

•  formation  of  initiated  cells; 

•  promotion  of  initiated  cells  resulting  in  appearance  of  the  first  malignant  clonogenic  cell; 

•  subsequent  growth  and  progression  of  malignant  tumor. 

The  duration  of  each  stage  of  carcinogenesis  is  thought  of  as  a  random  variable. 

The  above  classification  suggests  that  the  event  of  malignant  transformation  occurs  at  some 
random  time  T  representing  the  total  duration  of  the  first  two  stages  (initiation  and  promotion)  of 
carcinogenesis.  In  the  case  of  sporadic  carcinogenesis  the  r.v.  T is  measured  from  the  date  of  birth 
of  an  individual.  Let  Gr  be  the  survival  function  of  the  r.v.  T,  and  let  gT  stand  for  its  density. 
There  are  mechanistically  motivated  models  of  carcinogenesis  that  yield  an  explicit  expression  of 
the  survival  function  Gr(t)  or  the  corresponding  hazard  function. 

The  most  widely  accepted  two-stage  model  of  carcinogenesis  is  commonly  referred  to  as  the 
Moolgavkar-Venzon-Knudson  (MVK)  model  [23-25],  The  standard  form  of  this  Markovian 
two-stage  model  involves  four  parameters  (6q,  Ao ,Po,t]o)  that  refer  to  the  rates  of  initiation  of 
target  stem  cells  (that  is,  formation  of  primary  precancerous  lesions),  and  rates  of  division,  death 
(or  differentiation),  and  malignant  transformation  of  initiated  and  promoted  cells.  It  was  first 
pointed  out  by  Heidenreich  [26]  and  subsequently  by  Hanin  and  Yakovlev  [27]  and  Heidenreich  et 
al.  [28]  that  these  four  parameters  are  not  jointly  identifiable  from  time-to-tumor  data.  In  the  case 
of  constant  parameters,  all  triples  of  their  identifiable  combinations  have  been  described  in  [27].  In 
the  latter  case,  the  MVK  model  leads  to  the  following  explicit  formula  for  the  distribution  of  the 
total  duration  T  of  the  first  two  stages,  that  is,  of  the  time  from  the  birth  of  an  individual  to  the 
tumor  onset  [24,29-31] 
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Fig.  3.  The  probability  density  functions  pw  for  the  time  at  detection  (progression  stage  duration)  distribution  at  values 
of  model  parameters  estimated  from  data  on  the  volume  of  lung  cancer  at  diagnosis  and  different  values  of  the  pa¬ 
rameter  oc.  Solid  line:  a  =  2.3  x  10~8,  dashed  line:  a  —  2.3  x  10~10,  dotted  line:  a  =  2.3  x  10-12. 


Gr(t)  :=  Pr{T  >  t)  = 


'  (. a + By 

B  +  Aefi+W 


s 


t  ^  0. 


(16) 


Here  A,  B,  8  >  0  are  the  identifiable  parameters  of  the  model.  In  formula  (16)  we  use  the  fol¬ 
lowing  identifiable  parameterization: 


8  —  9o/ Ao,  A  —  y  (Ao  +  Ho  +  rjo)2  —  4Ao^0  +  {do  +  do  ~  A))) 

B  =  Y  (Ao  +  4Ao/i0  —  (ho  +  do  ~  ^o)- 

Properties  of  the  hazard  function  of  the  r.v.  T  distributed  in  accordance  with  the  survival  function 
(16)  are  described  in  [28], 

Fig.  4  shows  a  typical  p.d.f.  gT  that  corresponds  to  the  survival  function  (16).  In  these  com¬ 
putations,  we  used  the  values  of  A,  B,  and  8  for  lung  cancer  which  were  estimated  by  Luebeck  et  al. 
[31]  from  the  control  group  identified  through  the  Colorado  Uranium  Miners  Cohort.  Specifi¬ 
cally,  we  set  A  =  10-4,5  —  0.1821 ,  <5  =  0.0364  in  these  computations.  Numerical  algorithms  are 
available  for  computing  GT  in  the  case  of  piece-wise  constant  rates  0(),  Ao,  Ho>Vo  [24,32,33], 
Another  model  of  carcinogenesis  was  proposed  by  Yakovlev  and  Polig  [34].  The  key  feature  of 
the  Yakovlev-Polig  (Y-P)  model  is  that  it  allows  for  the  process  of  cell  death  to  compete  with  the 
process  of  tumor  promotion.  According  to  this  model,  the  hazard  function  (j)  of  the  time  T  of 
tumor  latency,  which  is  related  to  the  survival  function  by 
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Fig.  4.  The  probability  density  function  gT  for  the  time-to-onset  distribution  based  on  the  MVK  model  of  sporadic 
carcinogenesis.  Computations  were  carried  out  at  the  following  estimates  of  model  parameters  obtained  from  the 
Colorado  Uranium  Miners  Cohort  [32]:  A  =  1 0~4, B  =  0.1821,  S  =  0.0364. 


|  -  jf  </>(«)dwj, 


Gj{t)  —  exp  <1  —  /  0(w)dw}>,  t^O 
is  of  the  form 

0(m)  =  6\  exp  |  —  02  J  /(*)dxj  J  l(x)p(u—x)  dx,  u  ^  0, 


(17) 


(18) 


where  /  is  a  given  time-dependent  rate  of  external  exposure,  p  is  the  marginal  (with  respect  to  the 
joint  distribution  of  the  promotion  time  and  the  time  to  cell  death  for  initiated  cells)  p.d.f.  of  the 
tumor  promotion  time,  and  Q\  >  0,  02  ^  0  are  constants.  The  usual  practice  is  to  use  a  flexible 
parametric  family  of  distributions  for  the  function  p,  for  example  the  two-parameter  gamma 
distribution. 

The  hazard  function  (j)  has  a  maximum  whenever  62  >  0  and  either  of  the  functions  /  and  p  is 
bounded.  In  the  case  where  J0°°  /(x)dx  is  finite,  this  assertion  was  proven  in  [34].  If  /0°°  l(x)dx  is 
infinite  and  p  is  bounded  (almost  everywhere)  from  above  by  a  constant  C,  we  have 

4>{t)  ^  CQ\  f  J  /(jt)ckjexp  W  /(x)dx|; 

hence  </>(f)  — >  0  as  /  — >  oo.  It  is  easy  to  see  that  (j)  displays  the  same  behavior  if  the  function  /  is 
bounded  from  above.  Since  0(0)  =  0  and  <p(t)  >  0  for  /  >  0,  the  function  0  must  have  a  maxi- 
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mum.  However,  the  situation  is  not  the  same  if  we  set  02  equal  to  zero.  Suppose  in  addition  that 
l(x)  =  1  for  almost  all  x  ^  0,  which  is  a  reasonable  assumption  when  modeling  spontaneous 
carcinogenesis.  Then,  it  follows  from  (18)  that 

(j)(t)  =  6 1  f  p(x)dx, 

Jo 

that  is,  0  is  a  non-decreasing  function  of  time. 

Recently,  Hanin  and  Boucher  [35]  found  conditions  under  which  the  model  given  by  (17)  and 
(18)  is  identifiable  from  time-to-tumor  observations.  The  model  has  been  applied  to  various  sets  of 
experimental  and  epidemiological  data  to  gain  quantitative  insight  into  the  process  of  tumori- 
genesis  induced  by  radiation  [36-42]  and  chemical  carcinogens  [43^17],  The  model  also  explains 
some  peculiarities  in  the  incidence  of  female  colorectal  cancer  [48], 

Both  models  can  be  represented  [27]  by  the  following  general  formula  for  the  survival  function 
Gf. 

GT{t)  =  exp  |  —  0O  J  ^(M)dwj,  (19) 

where  0O  is  the  rate  of  initiation,  and  K  is  the  promotion  time  cumulative  distribution  function. 
More  specifically,  K(u)  defines  the  probability  that  a  cell  initiated  at  time  0  completes  the  pro¬ 
motion  stage  at  or  before  time  u.  The  distribution  K  is  improper  if  a  given  two-stage  model  allows 
for  cell  death  in  the  course  of  tumor  promotion.  Formula  (19)  extends  to  the  case  of  time- 
dependent  initiation  rate  6(t)  as  follows  [38]: 

GT{t)  =  exp  |  —  J  0(t  —  u)K{u )  dw| .  (20) 

Mechanistic  modeling  of  the  process  of  tumor  detection  suggests  a  natural  way  of  incorporating 
the  progression  stage  into  the  existing  mechanistic  two-stage  models  of  carcinogenesis.  When 
considering  some  common  human  malignant  tumors,  it  seems  plausible  to  assume  that  the  event 
of  malignant  transformation  is  rare  and  the  process  of  tumor  detection  is  triggered  by  the  first 
arrival  of  a  malignant  cell.  In  other  words,  once  the  first  malignant  cell  is  generated,  its  subsequent 
development  (progression)  into  an  overt  tumor  is  irreversible  and  the  detection  process  begins,  but 
those  cell  clones  that  may  be  generated  at  later  times  do  not  contribute  to  the  process.  In  this  case, 
the  time  of  tumor  latency  (age  at  tumor  detection)  can  be  represented  as  U  —  T  +  W.  Assuming 
stochastic  independence  between  T  and  W  we  obtain  the  p.d.f.  gv  of  U  as  the  convolution 

gu{t)  =  [  gT{t  -x)pw{x)dx.  (21) 

Jo 

Shown  in  Fig.  5  is  a  plot  of  the  density  gu  computed  in  accordance  with  formula  (21)  at  the 
parameter  values  used  earlier  for  computing  the  densities  pw  and  gT  (Figs.  3  and  4).  In  is  natural 
that  variations  in  the  parameter  a  are  even  less  tangible  than  those  seen  in  Fig.  3. 

However,  the  model  represented  by  convolution  (21)  does  not  work  well  in  settings  where 
multiple  tumors  are  observed.  In  such  settings  (for  one  example,  see  [47]),  the  development  of 
nonlethal  tumors  is  irreversible  and  the  total  number  of  tumors  generated  over  any  time  interval 
(0,  t]  is  recorded.  All  practically  used  two-stage  stochastic  models  of  carcinogenesis  are  essentially 
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Fig.  5.  The  convolution  of  the  p.d.f.s  gr  (see  Fig.  2)  and  pn-  (Fig.  3)  for  lung  cancer  as  a  function  of  the  parameter  a. 
Solid  line:  a  =  2.3  x  10-8,  dashed  line:  a  =  2.3  x  10~10,  dotted  line:  a  =  2.3  x  I0-'2. 

based  on  the  simplifying  assumption  that  the  process  of  initiation  can  be  modeled  as  a  Poisson 
process  with  intensity  6(t).  If  initiated  cells  are  promoted  independently  of  one  another,  the 
number  of  cells  initiated  and  subsequently  promoted  by  time  t  is  a  Poisson  process  with  intensity 

A(t)  =  f  6(t  —  u)K(u)du, 

Jo 

see  [27,34],  Introduce  the  convolution  representing  the  cumulative  distribution  function  of  the 
total  duration  of  promotion  and  progression  stages: 

[K*Gw]{u)=  [  K{u-w)AGw{w ), 

Jo 

where  Gw  is  the  cumulative  distribution  function  of  the  progression  stage  duration.  Here  we 
assume  again  that  the  promotion  and  progression  stages  are  independent.  It  follows  from  the 
assumption  on  the  Poisson  process  of  initiation  events  that  the  survival  function  of  the  time  to  the 
first  tumor  is  given  by 

G(t)  =  exp  j  ~  0{t-u)[K*Gw\{u)&i^, 


see  [27]. 
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The  latter  formula  has  in  fact  much  broader  implications  since  it  describes  the  situation  where 
multiple  events  of  tumor  detection  can  be  interpreted  as  independent  ‘competing  risks’.  Then  it 
follows  that  the  time  to  the  first  event  of  detection  is  given  by  a  minimum  of  a  random  (Poisson 
distributed)  number  of  independent  and  identically  distributed  random  variables  representing  in¬ 
dividual  detection  times  of  multiple  tumors  (or  cell  clones).  Therefore,  it  makes  sense  to  explore  this 
model  as  an  alternative  to  model  (21)  in  a  wider  spectrum  of  applications,  including  lethal  tumors. 


6.  ldentifiability  of  the  distribution  of  the  time  to  detection 


Let  us  consider  now  the  following  problem.  Suppose  that  the  distribution  of  the  time  to  onset  T 
either  belongs  to  the  gamma  family  or  is  specified  according  to  the  MVK  model  by  formula  (16). 
In  both  cases,  parameters  of  the  distribution  of  the  r.v.  T  are  identifiable  from  its  observed  dis¬ 
tribution  [27,35].  The  same  is  true  for  the  Y-P  model  under  some  mild  conditions  formulated  in 
[35]. 

It  immediately  follows  from  formula  (4)  that  the  p.d.f.  of  the  time  to  detection,  pw ,  is  given  by 


Pw(w)  =  af  (w)  exp 


-f 


f(s)  d  s 


w  ^  0. 


(22) 


Assuming  that  tumor  grows  exponentially  with  a  fixed  rate  2,  one  can  find  on  the  basis  of  formula 
(22)  that  the  parameters  a  and  X  are  identifiable  within  this  model  given  the  times  of  spontaneous 
tumor  detection.  A  natural  question  is  whether  the  entire  set  of  parameters  involved  in  the  dis¬ 
tribution  of  the  age  at  spontaneous  detection  is  identifiable.  We  limit  our  consideration  to  the 
model  (21)  representing  the  time  of  tumor  latency  U  as  the  sum  of  the  two  independent  random 
variables  T  and  W.  This  leads  to  the  following  problem. 


Problem.  Let  0>  and  be  two  families  of  probability  distributions  on  [0,  oo).  Is  it  true  that  the 
family  of  convolutions  P  *  G,  where  and  Gsf,  is  identifiable?  In  other  words,  does 

Pi  *  Gx  =  P2  *  G2, 

where  PX,P2€^  and  GUG2€  imply  that  P\  =  P2  and  Gx  =  G2? 

Taking  both  families  to  be  the  set  of  degenerate  distributions  5a,a  ^  0,  and  observing  that 
Sa  *  8b  =  Sa+h,  we  conclude  that  in  general  the  answer  to  this  question  is  negative.  Yet  another 
counterexample  is  given  by  gamma  distributions  r(ai,b)  and  r(a2,  b )  with  the  convolution  being 
equal  to  r{a\  +  a2,  b ).  To  formulate  a  theorem  providing  sufficient  conditions  for  the  positive 
solution  of  the  problem  we  need  the  following  definition,  see  [35], 

Definition  2.  A  family  of  absolutely  continuous  distributions  on  [0, 00)  is  called  graduated  if  for 
every  two  distinct  p.d.f.’s  qx  and  q2  from  this  family  and  for  every  e  >  0  there  exists  a  number 
M  >  0  such  that  either  qx  (x)  ^  sq2(x)  for  all  x  >  M  or  q2(x)  ^  sq\  (x)  for  all  x  >  M. 

In  other  words,  a  family  is  graduated  if  the  limit  at  infinity  of  the  ratio  of  any  two  distinct 
p.d.f.’s  from  the  family  is  either  0  or  infinity.  It  is  easy  to  see  that  the  gamma  family  is  graduated. 
The  same  is  true  for  the  family 
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g«,x(x)  ~  ae;Ue  ^  x  ^  0;  a,  X  >  0,  (23) 

of  p.d.f.’s  of  the  form  (22)  with  f{w)  =  eXw  corresponding  to  the  survival  functions  (14). 

Remark  i.  Observe  that  family  (16)  is  not  graduated,  because  any  p.d.f.  from  this  family  with 
parameters  A,  B,  5  behaves  at  infinity  like  Cexp{— bdt},  where  C  is  a  positive  constant  depending 
on  A,B  and  5,  so  that  the  ratio  of  p.d.f.’s  for  two  distinct  distributions  with  parameters 
and  A2,B2,52  satisfying  B\8\  =  B252  tends  at  infinity  to  a  constant  different  from  0  and  infinity. 
The  same  is  true  for  any  family  of  distributions  that  can  be  represented  in  the  form  of  formula  (19) 
if  the  distribution  K  has  finite  first  moment. 


Theorem  1.  Let  IP  and  *3  be  two  families  of  absolutely  continuous  probability  distributions  on  [0,  oo) 
with  p.d.f’ s  p  €  &  and  g  E  3.  Suppose  that 

1 .  family  IP  is  graduated', 

2.  for  every  p  E  2P,  there  is  M  =  M(p)  >  0  such  that  p{t)  >  0  for  all  t  >  M; 

3.  for  every  p  €  &  and  for  each  s  >  0,  there  exists  a  finite  limit 


hp(s) 


:=  lim 

/— >oo 


Pit  ~  *) 

Pit)  ’ 


4.  for  each  g  €  3, 


£s  l  e^ufgis)  ds = l  /,'(s)s(s)  d'; 

5.  for  all  p  G  0*  and  g  E  3, 

/•OO 

0<  /  hp(s)g(s)  ds  <  oo. 

Jo 


Then  the  family  of  convolutions  P  *  G,  where  P  €  SP  and  G  G  3,  is  identifiable. 


Proof.  Suppose  that  for  some  distributions  P\,P2  €  &  and  G\,  G2  G  3  we  have  P\  *  G\  =  P2*  G2. 
Then 

[  Pi(t-s)gi(s)ds=  [  p2(t  -s)g2(s)ds,  t  Ss  0. 

Jo 

Assuming  that  t  >  M  :=  ma x{M{p\),M{p2)']  we  rewrite  this  equation  in  the  form 

P,{,)1  PJMTlg'{s)as=M,) l  ^Tfl<s)ds’  ,>M ■ 

Passage  to  limit  as  t  — » oo  in  this  equation  with  conditions  (3)-(5)  of  the  theorem  taken  into 
account  yields 

lim  PlW  =  r  hpfs)g2(s)ds 
00  Piit)  /0°°  hPl  (s)gi  (s)  ds ' 

Invoking  conditions  (1)  and  (5)  we  conclude  that  p\  —pi-  Denote  this  function  by  p. 

For  any  p.d.f.  (p  of  a  non-negative  random  variable,  let  ip  be  its  Laplace  transform  defined  by 
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poo 

<p(z)  :=  /  e~zt(p(t)dt,  Rez^O. 

Jo 

Then 

p(z)gi  (z)  =  p{z)g2  (z) ,  Re  z^O.  (24) 

Since  p  is  a  non-zero  analytic  function  in  {Re  z  >  0},  the  set  Q  :=  {t  €  (0,  oo):  p(t)  /  0}  is  open 
and  non-empty.  From  (24)  we  conclude  that  gx  =  g2on  Q  and  hence,  by  the  uniqueness  theorem 
for  analytic  functions,  gx{z)  =  g2(z)  for  all  z  with  Re  z^  0.  Therefore,  by  the  uniqueness  theorem 
for  the  Laplace  transform,  g\  =  g2.  Thus,  Pi  =  P2  and  Gi  =  G2.  Theorem  1  is  proved. 

Remark  2.  For  concrete  families  PP  and  condition  (4)  usually  follows  from  standard  theorems 
about  passage  to  limit  in  the  Lebesgue  integral.  In  particular,  Theorem  1  can  be  applied  in  the  case 
when  PP  is  the  family  of  gamma  distributions  r(a,b )  with  shape  parameter  a  ^  1  and  ^  is  the 
family  (23).  First,  conditions  (1)  and  (2)  of  Theorem  1  are  obviously  satisfied.  Next,  for  p  €  r(a,  b) 
with  a  ^  1  we  have  for  all  s  ^  0 

p{t  —  s)  ( t  —  s\a~x  b  b 

—  7 . . -  =  - )  ebs  ->  e°  as  t ->  oo. 

p(t)  \  t  ) 

Hence,  condition  (3)  is  met  with  hp(s)  =  exp{fo}.  Further,  assuming  that  a  >  1  we  obtain,  for  any 
p.d.f.  g  from  the  family  (23), 

[  =  [  ^**1®*)*  - 1  +4A*  <  °° 

by  the  Lebesgue  theorem  on  dominated  convergence,  which  is  condition  (4).  Finally,  condition  (5) 
also  holds.  This  leads  us  to  a  conclusion  that  if  promotion  time  T  has  a  gamma  distribution 
r(a,  b)  with  a  ^  1  and  tumor  growth  is  exponential  with  rate  X,  then  parameters  a,  b,  a  and  X  are 
jointly  identifiable  from  the  observed  distribution  of  the  age  at  spontaneous  detection. 

Remark  3.  It  is  easy  to  check  that  the  convolution  of  the  density  corresponding  to  the  MVK 
model  given  by  (16)  and  the  p.d.f.  specified  by  (23)  does  not  satisfy  the  sufficient  conditions 
formulated  in  Theorem  1 .  The  same  is  true  for  this  convolution  if  the  MVK  model  is  replaced  with 
the  Y-P  model.  This  does  not  mean,  however,  that  these  convolutions  are  non-identifiable,  but 
more  powerful  analytical  results  are  necessary  to  clarify  their  properties  associated  with  the  notion 
of  identifiability. 


7.  Model  stability 

Suppose  the  law  of  tumor  growth  is  described  by  an  increasing  function  /.  A  natural  question 
to  ask  is  how  sensitive  is  the  distribution  of  the  r.v.  W  given  by  (4)  to  the  change  of  the  law  of 
tumor  growth.  Solving  this  problem  presupposes  the  choice  of  two  metrics  that  measure  distances 
between  the  distributions  of  W  and  functions  /,  and  establishing  a  relation  between  these  metrics. 

For  two  survival  functions  F\,F2  of  non-negative  random  variables,  denote 
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PcoM)  ■=  sup{|F,(0 -F2(t)\  :t>  0}.  (25) 

Obviously,  this  formula  defines  a  probability  metric.  Next,  let  S'  be  the  set  of  all  non-negative 
increasing  functions  on  [0,  oo).  Given  a  >  0,  define,  for  any  functions  fi,fi  £  S' , 

dJJ\ ,fi)  :=  inf  max  (a  [  \f,{t)  -  f2(t) |  d t, 

T>Q  l  JO 

exp(_a/  J‘Wd/)>  exp(~a/  J2Wd/)}-  (26) 


It  is  readily  checked  that  da  is  a  metric  on  S'.  Also,  it  is  easy  to  verify  that  for  all  fi,f2  6  S'  ,f\  / 
f2,  there  exists  a  unique  number  T0  =  To(f\,f2)  such  that 


ijjuf. 2)  =  a  f  |/i (t)  -f2(t)\dt 

JO  T 

afo 


=  max  <  exp 


exp 


‘  [r°f2(t)dt 

Jo 


Theorem  2.  Let  f\,fi  G  ^  be  two  laws  of  tumor  growth ,  and  let  F\,F2  be  the  corresponding  survival 
functions  defined  in  (4).  Then 

Po*{FuF2)  ^da(f\,f2). 


Proof.  Fix  T  >  0.  Denote 


da{f\Ji\T)  :=max|a^  -  f2(t)\dt,  exp  ^  -  a  jf  /i(f)d<Y  exp^-a^  /2(t)d^|, 


so  that  in  view  of  (26) 

da(fuf2)  =  inUa(fhf2-,T). 

Observe  that  according  to  the  mean  value  theorem  for  all  x,  y>  0, 
|e-ax  —  e~oy'|  <  <x\x  -  y|. 

Therefore,  for  O^t^T,  we  have 


’■J  f\(t)dt]  -  exp  ^-a  f2(t)dt\ 
-r 


exp  y  —  a 

/  \A0)ds-  f  f2{s)ds\^oc  [  \f\{t)  -  f\{t)\ 
Jo  Jo  Jo 

^dztfufr,  T ). 


d  t 


(27) 


(28) 


Next,  for  t  >  T , 
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\F[  (t)  -  F2(t) |  <  max  j  exp  ^  -  a  jf  fx  ( u )  d u^j ,  exp  ^  -  a  ^  f2(u) duj  | 
<  max  |  exp  ^  -  a  J  f\(u) dw^ ,  exp  ^  -  a  J  f2(u) dw^  | 


<da(fuf2;T). 


(29) 


Combining  (28)  and  (29)  we  find  that,  for  all  t  ^  0, 

\Fl(t)-F2(t)\^dx(fuf2;T),  T>  0. 

Therefore,  p00(Fl,F2)  <  dri(fuf2]  T)  for  all  T  >  0,  which  in  view  of  (27)  concludes  the  proof  of 
Theorem  2. 


8.  Joint  distribution  of  age  and  tumor  size  at  detection  and  its  randomized  form 

Recall  that  T  is  the  age  at  tumor  onset,  W  is  the  time  of  spontaneous  tumor  detection  measured 
from  the  onset  of  disease,  and  S  is  the  tumor  size  at  spontaneous  detection.  Then  S  =  f(W),  where 
/  :  [0,  oo)  -»  [1,  oo)  is  a  deterministic  function  describing  the  law  of  tumor  growth.  It  is  assumed 
that 

1 .  random  variables  T  and  W  are  absolutely  continuous  and  independent; 

2.  function  /  is  differentiable  and  f  >  0; 

3.  the  rate  of  spontaneous  tumor  detection  is  proportional  to  the  current  tumor  size  with  coeffi¬ 
cient  a  >  0. 

It  follows  from  Assumption  3  that  the  p.d.f.  of  the  random  variable  W is  described  by  formula 

(22). 

We  observe  sample  values  of  the  random  vector  Y  (T  +  W,S)  whose  components  are  in¬ 
terpreted  as  age  and  tumor  size  at  spontaneous  detection,  respectively.  We  look  at  Y  as  a 
transformation  of  the  random  vector  X  :=  (T,  W),  Y  =  (p(X),  where  cp(t,w)  =  (t  +  w,f(w)), 
t,  w  ^  0.  Observe  that  components  of  X  are  independent  random  variables.  The  inverse  function 
i j/  =  (p~l  :A->  IR+,  where  A  :=  {(«,  v)  e  IR+  :  1  ^  v  ^/(w)},  is  given  by  t //{u,  v)  =  (u-  g{v),g(v)), 
with  g  :=  f~x.  Note  that  the  Jacobian  of  if/  is  g'.  Then  for  the  p.d.f.  of  Y  we  have  assuming  that 
(. u ,  v)  e  A 

pY(u,v)  =px(il/(u,v))g'(v) 

=  pT(u  -  g(v))pw(g(v))g'{v) 

=  Pt{u  -  g(v))ps(v). 

In  the  particular  case  of  exponential  tumor  growth  with  rate  X  >  0  (f  (w)  =  e;-lv)  we  obtain  using 
formula  (7) 

Py{u,v)  =^X~^v~l)pT{u-^-\,  u  ^  0,  l^v^eXu.  (30) 

Thus,  the  distribution  of  random  vector  Y  is  absolutely  continuous  but  the  support  of  Y  depends 
on  the  unknown  parameter  X.  As  far  as  the  asymptotic  likelihood  inference  is  concerned,  the  usual 
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regularity  conditions  are  not  met  for  the  distribution  pY.  However,  experience  with  similar 
parametric  settings  suggests  that  the  estimation  efficiency  for  the  parameter  X  may  be  expected  to 
be  even  higher  than  in  the  regular  case  although  asymptotic  normality  may  fail. 

Suppose  the  distribution  of  the  time  of  tumor  latency  T  is  known.  Let  {(«,■,«,):  l^i^n}  be 
sample  data  on  age  and  tumor  size  at  detection.  The  structure  of  the  joint  distribution  (30) 
suggests  the  following  maximum  likelihood  procedure  for  estimation  of  the  parameters  a  and  X: 

(1)  Denote  6  =  tx/X  in  formula  (30),  and  find  the  maximum  likelihood  estimate,  0,  of  the  pa¬ 
rameter  0  using  only  the  tumor  size  data  {«;,  :  1  <  i  <  »}.  It  follows  (see  below)  that  the  sample  {«,-} 
is  drawn  from  an  exponential  distribution  with  parameter  0,  and  consequently 


(2)  Maximize  the  function 

L(X)  =  (“t  ~ J.  u>  >  °>  V/>\, 

or  its  logarithm,  to  find  the  estimate  of  X  denoted  by  X. 

(3)  The  maximum  likelihood  estimate  of  a  is  given  by  a  =  OX. 

The  above  procedure  does  the  same  job  as  maximizing  the  likelihood  function  based  on  the 
joint  distribution  (30).  To  show  this,  let  the  joint  density  of  the  random  variables  U  and  Fbe  of 
the  form 

p(u,  v;  X,  0)  =  g(v ;  0)f  ^ 

with  u  >  0,  v  ^  1 ,  X  >  0,  and  (p  :  [1 ,  oo)  — >  (0,  oo).  It  is  assumed  that  g(x)  >  0  for  x 

for  t  >  0,  and  f(t)  =  0  for  t  ^  0.  Suppose  that  there  exists  a  unique  maximizer  (i,  0)  for  the 

likelihood  function 

n 

L(X,  6)  =  £[/>(«;,  t;,;  A,  0). 

i=i 

It  is  clear  that  X  and  0  are  unique  maximizers  for  the  functions 

li(A)  =  n/(«/  ~^r)  and  L2(e)  =  flg(vr,0), 

respectively.  Conversely,  if  X  >  0  and  6  are  unique  maximizers  for  these  functions,  the  pair  (X,  0)  is 
a  unique  maximizer  for  the  likelihood  function  L(X,  6).  Finally,  observe  that  g(v;  0)  is  the  marginal 
density  of  the  random  variable  V.  Indeed,  we  have 

/  f{u~^Y'SSjs(v,0)du  =  g(v,e)J^  f(t)dt  =  g(v;  0). 

The  performance  of  the  above-described  estimation  procedure  was  studied  by  computer  simu¬ 
lations.  A  total  of  50  pseudo-random  samples  of  (w„  vt)  were  generated  from  the  joint  distribution 
(30);  each  sample  contained  n  —  100  realizations  of  the  random  vector  (U,V).  We  used  the 
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composition  method  [49]  to  simulate  samples  of  pairs  »,•).  In  accordance  with  this  method,  we 
first  draw  vt  from  the  marginal  distribution  of  the  random  variable  V,  and  then  generate  u,  from 
the  distribution  of  U  conditional  on  V  =  vt.  The  p.d.f.  pT  was  specified  by  the  MVK  model  with 
the  survival  function  given  by  formula  (16).  We  used  the  following  values  of  model  parameters: 
a  =  2.3  x  10_10,2  =  6.9, A  =  10 ~4,B  =  0.1821,5  =  0.0364.  These  values  are  suggested  by  the 
analysis  of  lung  cancer  data  (Sections  4  and  5). 

8.1.  Simulation  Experiment  1 

In  this  experiment,  we  kept  the  parameters  A,B,  and  5  at  their  true  values  and  applied  the 
estimation  procedure  to  simulated  data  in  order  to  obtain  estimates  of  the  parameters  X  and  a.  In 
this  case,  the  likelihood  function  can  be  maximized  by  a  univariate  search  for  X  with  a  fixed  value 
of  9.  The  estimates  of  X  and  a  which  resulted  from  each  of  the  50  samples  were  summarized  by 
calculating  their  sample  means  X  and  a,  as  well  as  the  corresponding  standard  errors  (of  the 
sample  mean)  denoted  by  ax  and  cr5,  respectively.  We  obtained  the  following  numerical  values: 
X  =  7.45,  a i  =  0.9,  a  =  2.53  x  10-10,  oR  —  0.34  x  10~10.  These  results  testify  that,  given  the  pa¬ 
rameters  A,B  and  5  are  known,  the  estimation  procedure  performs  well  when  applied  to  finite  but 
sufficiently  large  samples. 

8.2.  Simulation  Experiment  2 

Proceeding  from  the  same  true  parameter  values,  the  estimation  procedure  was  applied  to 
simulated  data  to  obtain  estimates  of  all  the  parameters  incorporated  into  the  model.  We  used 
algorithm  FLEXI  [50]  to  maximize  the  corresponding  likelihood  function.  Since  there  were  three 
additional  parameters  to  be  estimated  from  simulated  data,  the  size  of  each  sample  was  increased 
up  to  1000.  The  results  were  summarized  in  just  the  same  way  as  in  Experiment  1  to  give:  X  =  9.4, 
G-k  =  0.9,  a  =  3.1  x  10~10,  <rg  =  3.1  x  10  n,  A  =  9.5  x  10'4,  aA-=  3.6  x  10"4,  5  =  0.1407, 
—  0.0599,  b  =  0.0507,  aj  =  0.006.  The  50  estimates  of  the  p.d.f.  pr(t)  with  the  above  estimates 
substituted  for  its  parameters  were  averaged  for  each  value  of  t.  The  resultant  average  (arithmetic 
mean)  agrees  quite  closely  with  the  true  p.d.f.  pT  as  seen  in  Fig.  6. 

8.3.  Simulation  Experiment  3 

The  estimation  procedure  was  applied  to  a  single  sample  of  size  50000  generated  from  the  joint 
distribution  (30).  The  estimated  parameter  values  were:  X  =  6.7,  a  =  2.24  x  10-10,  A  =  5.1  x  10~4, 
B  —  0.1390,  <5  =  0.0475.  The  true  p.d.f.  pT  is  compared  with  its  parametric  estimate  in  Fig.  7. 

The  above  simulation  experiments  show  that  estimation  of  the  whole  set  of  model  parameters  is 
feasible  given  the  model  is  adequate  for  the  processes  under  study,  but  obtaining  unbiased  esti¬ 
mates  would  require  large  sample  sizes. 

Suppose  now  that  the  process  of  tumor  growth  is  described  by  the  exponential  law 
f(w)  =  eAw,  w  ^  0,  with  a  random  growth  rate  X.  We  also  assume  that  the  random  parameter 
9  :=  oc/X  is  gamma  distributed  with  parameters  a  and  b.  Compounding  (30)  with  respect  to  the 
gamma  distribution  of  the  parameter  6  we  find  the  p.d.f.  of  the  resulting  randomized  distribution 
of  the  vector  Y 
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Fig.  6.  Simulation  Experiment  2:  Comparison  of  the  true  p.d.f.  pT  specified  by  the  MVK  model  (solid  line)  with  the 
corresponding  estimate  (dashed  line)  obtained  by  averaging  over  50  parametric  estimates  of  the  p.d.f.  pT. 


Fig.  7.  Simulation  Experiment  3:  Comparison  of  the  true  p.d.f.  pT  specified  by  the  MVK  model  (solid  line)  with  the 
corresponding  estimate  (dashed  line)  obtained  from  a  single  sample  of  50  000  pairs  of  observations  drawn  from  the  joint 
distribution  (30). 


7  a  rctu/lnv  /  |n  ,,  \ 

p{u ,  V )  =  Yj-j  J  tnQ~(b+v~x),pT -  —  t)  d/,  u  5*  0,  V  ^  1 . 

Setting  s  :=  u  -  (lny/a)/  we  rewrite  the  last  formula  in  an  equivalent  form 

p(u-v)=W)^y*'  l  (3i) 


R.  Bartoszyhski  et  al  /  Mathematical  Biosciences  171  (2001)  113-142 


135 


for  u  ^  0,  v  ^  1.  Alternatively,  we  may  assume  that  it  is  the  parameter  1/A  that  is  gamma  dis¬ 
tributed  with  parameters  a  and  b.  Should  this  be  the  case,  we  would  have 


xba  fu/lnv 

P(U’V)=T^~ T  /  ta exp{-[b  +  a(v  -  l)]t}pT{u  -  t\nv)dt 
r{a)  Jo 


oiba 


(In  v)a+lr(a) 


fur  f  b  +  a(u-l)  | 

J  (u-s)  exp  | - ^ - (m  -  s)  jM-s)  as 


(32) 


for  u  ^  0,  v  >  1. 

Once  the  density  of  the  age  at  tumor  onset  T  is  specified  within  a  certain  parametric  family 
(e.g.  using  a  gamma  distribution  or  the  MVK  model),  Eqs.  (31)  or  (32)  allow  us  to  compute  p.d.f. 
of  the  joint  distribution  of  age  and  tumor  size  at  detection.  Observe  that  in  this  randomized 
version  the  support  [0,  oo)  x  [1,  oo)  of  the  distribution  of  random  vector  Y  is  parameter-free.  The 
maximum  likelihood  parametric  inference  based  on  the  joint  p.d.f.  p(u,  v)  accommodates  censored 
observations  under  the  usual  censorship  model  [51], 


9.  A  threshold  model  of  tumor  detection 


An  alternative  approach  to  stochastic  modeling  of  spontaneous  detection  was  proposed  and 
extensively  discussed  in  [38,52,53],  The  main  postulate  of  the  model  developed  in  these  works  is 
that  a  tumor  becomes  detectable  when  its  size  attains  some  threshold  value,  N,  which  is  treated  as 
a  random  variable.  The  authors  used  a  linear  pure  birth  process  with  random  absorbing  upper 
barrier  N  to  model  the  dynamics  of  tumor  growth.  Under  this  model  the  progression  time  cu¬ 
mulative  distribution  function,  given  the  threshold  level  N,  is 

F(t|A0  =  (l-e-T"‘,  (33) 

where  A  is  the  birth  rate.  Formula  (33)  implies  that  tumor  growth  starts  from  a  single  malignant 
cell  at  time  t  =  0. 

As  mentioned  in  Section  4,  it  is  practical  to  represent  the  critical  number  of  tumor  cells  as 
N  =  mV,  where  V  is  the  volume  of  a  tumor,  and  m  —  \/c  is  the  concentration  of  tumor  cells  per 
unit  volume.  The  constant  m  is  non-random  and  its  values  are  typically  large.  Thus  the  condi¬ 
tional  progression  time  c.d.f.,  given  the  threshold  volume  V  =  v,  is 

F(t\v)  =  (1  -e"*)""’-1.  (34) 


Let  f(t\v)  stand  for  the  p.d.f.  of  F(t\v). 

Let  G(t)  be  the  survival  function  of  the  time  it  takes  for  the  initiation  and  promotion  processes 
to  result  in  the  event  of  neoplastic  transformation.  Assuming  that  the  initiation  rate  is  constant, 
i.e.  9{t)  =  6,  and  the  stages  of  promotion  and  progression  are  mutually  independent,  the  authors 
of  [52]  used  the  convolution 


g(f|t>)  =  6  f  K(t  —  u)  exp 
Jo 


f(u\v)  d  u 


(35) 


to  represent  the  conditional  p.d.f.,  g(f|»),  of  the  time  of  tumor  latency  measured  from  the  date  of 
birth  of  an  individual  (see  formulas  (19)  and  (21)). 
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Introducing  a  prior  distribution,  P(v),  of  the  random  variable  V,  we  represent  the  p.d.f.  of  the 
time  (age)  of  tumor  detection  as 


g(t\v)p(v)dv, 


(36) 


where  p(v)  is  the  density  of  P(v),  and  the  distribution  P(v)  is  assumed  to  have  finite  first  moment. 
One  is  primarily  interested  in  the  conditional  p.d.f.  of  tumor  volume  at  detection  (given  a  tumor  is 
detected  at  time  t),  hereafter  denoted  by  w(v\t).  By  virtue  of  Bayes’  formula  we  have 


w(v\t)  — 


g(*lp)/>(p) 

/o°°  g(t\u)p(u)du 


g(  0 


(37) 


where  g(t\v)  and  g(t)  are  given  by  (35)  and  (36),  respectively. 

This  model  yields  an  interesting  asymptotic  result  showing  that  the  conditional  p.d.f.  w(v\t) 
assumes  a  very  simple  form  when  t  tends  to  infinity.  This  limiting  form  does  not  involve  the 
promotion  time  distribution  K  and  it  also  has  some  distinct  advantages  as  far  as  estimation 
problems  are  concerned  [53],  It  follows  from  (34)  and  (35)  that 

g(t|u)  =  XBmv  J  e~;('_s)(l  —  g-'l('-*))m,’-lzf(5) exp  |  —  6  J  /f(x)di|  ds  =  /£mv\l/(t). 


Proven  in  [53]  is  the  following  theorem. 


Theorem  3.  The  following  assertions  hold  for  the  limiting  behavior  of  the  function  i j/(t)  as  t  — >  oo: 
1.  If  X  <  6,  then 

\)/(t)  ~  Ie~Xl, 

where 


,  =  l  txv{h-el  AT(jc)djc|^(5)  d s. 

2.  If  X  =  6  and  /0°°[1  —  A^(5,)]ds’  <  oo,  then 

ij/(t)  ~  /  exp  {-1/'  AT(j:)dxj>. 

3.  If  X>  9,  then 

\J/(t)  ~  Jexp  /  —  6  (  AT(x)dx 


with 


J  =  \ /  /"’"'(I  -y)'mdy  =  \simv, i  -  9/X), 

where  B(x,y )  is  the  beta  function. 
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Corollary  1.  Ifmv  — ►  oo,  it  follows  from  Theorem  2  that  the  limiting  conditional  p.d.f  of  tumor  size 
at  detection  is  of  the  form 


lim  w(v\t) 

t— KX) 


v^p(v) 


r  ^p(u)du 


H  =  min 


A  special  case  (p=  1)  of  this  distribution  is  known  as  the  length-biased  sampling  distribution  in 
the  theory  of  stationary  point  processes  [54],  Analysis  of  data  on  breast  cancer  [53]  revealed  that 
the  limiting  distribution  is  quite  adequate  starting  with  the  age  of  50. 

In  studying  stability  of  the  posterior  p.d.f.  of  tumor  volume  at  detection  under  perturbation  in 
the  prior  p.d.f.  p  [53],  the  following  metric: 

poo 

Pn(fJ)-=  u^f(u)-f{u)\du 
Jo 

is  a  natural  choice.  The  corresponding  theorem  and  some  other  stability  results  are  described  at 
length  in  [53].  An  extended  discussion  of  the  above-described  model  in  the  context  of  parametric 
analysis  of  clinical  data  on  breast  cancer  is  presented  in  [55], 


10.  Modeling  metastatic  process 

Metastatic  progression  of  cancer  can  be  modeled  proceeding  from  the  assumption  that  shed¬ 
ding  a  metastasis  is  a  quantal  response  event.  From  the  practical  point  of  view,  it  seems  important 
to  derive  a  formula  that  gives  the  probability  that  at  the  time  of  detection,  the  primary  tumor  has 
not  yet  metastasized.  This  problem  is  tractable  if  we  resort  to  a  deterministic  description  of  tumor 
growth.  In  doing  so,  we  still  can  incorporate  an  additional  randomness  into  the  model  by  allowing 
for  random  values  of  its  numerical  parameters. 

To  derive  a  useful  formula  for  the  probability  of  the  event  of  interest  suppose  that,  in  the  course 
of  growth,  the  tumor  produces  metastases  with  an  intensity  which  is  proportional  (with  coefficient 
0  to  its  current  size  N(t).  We  assume  that  N(t)  is  a  non-random  function  of  time.  Specifically,  we 
assume  that  the  process  of  shedding  metastases  by  the  primary  tumor  is  Poisson  with  intensity 
CN(t).  Each  metastasis  is  assumed  to  develop  independently  of  the  subsequent  growth  of  the 
primary  tumor  and  of  other  metastases.  Any  given  metastasis  grows  deterministically  and  the  rate 
of  its  detection  is  proportional  (with  coefficient  a2)  to  the  current  metastasis  volume.  In  other 
words,  given  the  time  of  metastasis  origination  t,  the  rate  of  its  detection  at  time  t  >  t  equals 
<x2M(t  —  t),  where  Mis  the  size  of  the  metastasis.  The  primary  tumor  is  detected  at  a  random  time 
W\  measured  from  the  time  of  tumor  onset  t  =  0.  The  rate  of  the  primary  tumor  detection  at  time  t 
is  equal  to  c/.\N(t).  Let  W2  be  the  time  (also  measured  from  t  =  0)  to  the  detection  of  the  first 
metastasis. 

Yorke  et  al.  [56]  considered  a  model  of  metastatic  spread  under  a  deterministic  threshold 
mechanism  of  tumor  detection.  The  authors  used  similar  assumptions  on  growth  characteristics  of 
primary  and  metastatic  tumors,  but  they  did  not  evaluate  the  probability  Pr{Wx  <  W2}  in  their 
computations. 
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Note  that 


n(t)  =  e-“2foM{u)du  (38) 

is  the  probability  that  a  metastasis  of  age  t  remains  undetected.  The  probability  that  the  primary 
tumor  produces  k  metastases  during  the  time  t  is 

Pk{t)  =-t  (cjf  N(u)du^j  exp  |  -  C  N(w)dwj,  k  ^  0. 


Let  t,  be  the  time  of  the  ith  metastasis  formation  given  that  the  number  of  metastases  produced  by 
time  t  is  equal  to  k  and  i  ^  k.  Now  we  use  the  well-known  fact  that  Ti , . . . ,  t*  are  independent  and 
identically  distributed  random  variables,  having  the  common  density 


(  «(») 
p(u)  =  < 

lo, 


u  >  t, 


see  [57],  Then 


II  =  Pr{ W\  <  w2} 


(39) 


where  n(t)  and  p(t)  are  given  by  formulas  (38)  and  (39),  respectively. 

The  last  formula  assumes  an  especially  simple  form  if  the  growth  functions  N{t )  and  M(t )  are 
exponential  with  constant  proliferation  rates  X\  and  X2,  respectively.  In  this  case,  introducing  the 
notation 


p(u)n(t 


u )  d  w 


eA,/  —  1 


-/JH, 


d  u, 


we  have 


r°°  (  e;''  -  1  ) 

n  =  aj  J  exp|/lit - J — [<X|  +C(1  -6(t))]|d^.  (40) 

Incorporation  of  the  distribution  of  time  T  of  tumor  onset  into  the  above  formulas  presents  no 
difficulties.  Randomized  versions  of  the  model  are  also  pretty  straightforward. 

We  used  formula  (40)  to  conduct  numerical  experiments  with  the  aim  to  study  the  behavior  of 
Pr{W\  <  W2 }  as  a  function  of  model  parameters. 


Example  1.  In  this  numerical  experiment,  the  primary  and  the  secondary  (metastases)  tumors 
grow  at  the  same  rate  (Xi  —  X2)  with  the  corresponding  detection  rate  constants  being  equal 
(a]  =  a2 ).  Computations  were  carried  out  in  accordance  with  formula  (40)  at  the  following  values 
of  model  parameters:  a,  =  a2  =  0.03,  X\  =  k2  =  3.0.  These  values  have  no  relevance  to  any  actual 
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C 

Fig.  8.  The  probability  =  <  W2}  as  a  function  of  £.  The  other  parameter  values  are:  ol\  =  a2  =  0.03, 

Ai  =  A2  =  3.0. 

biological  data  and  serve  only  as  a  purpose  of  illustration.  The  dependence  of  77  =  Pr{W,  <  W2} 
on  the  parameter  £  is  shown  in  Fig.  8.  This  dependence  appears  to  be  fairly  flat  in  the  given  range 
of  model  parameters. 

Example  2.  Presented  in  Table  1  are  the  values  of  n  =  Pr{W\  <  W2j  as  a  function  of  the  sensi¬ 
tivity  parameter  a2  and  the  metastatic  growth  rate  X2.  The  dependence  of  77  on  the  sensitivity 
parameter  a2  in  the  case  of  equal  growth  rates  for  the  primary  and  metastases  is  shown  in  Fig.  9. 
All  profiles  of  the  dependencies  under  study  indicate  a  low  sensitivity  of  the  probability  77  to 
variations  in  the  parameters  of  this  model. 

The  probability  77  can  also  be  estimated  non-parametrically  by  the  corresponding  relative 
frequency  in  cohort  studies.  Therefore,  the  usefulness  of  the  above  analytic  formula  for  77  consists 
mainly  in  that  the  parametric  estimate  can  be  compared  with  its  non-parametric  counterpart, 
thereby  suggesting  an  additional  criterion  for  model  validation.  However,  this  criterion  seems  to 
be  rather  weak.  Indeed,  the  results  of  the  two  numerical  experiments  suggest  that  the  probability 
77  is  relatively  insensitive  to  the  basic  parameters  incorporated  into  the  model.  In  other  words, 
inaccuracy  of  parameter  estimates  (obtained  from  data  on  age  at  diagnosis  of  the  primary  tumor 
and  metastasis-free  survival)  does  not  appear  to  affect  much  the  resultant  estimate  of  the  prob¬ 
ability  77;  this  conclusion  is  very  preliminary  since  it  has  been  drawn  from  numerical  experiments 
carried  out  in  a  limited  range  of  parameter  values. 


Table  1 


The  probability  77 

=  Pr{W\  <  W2}  as  a  function  of  a2  and  X: 

(ai  =  0.03,  X\ 

=  3.0,  C  =  0.1) 

« 2 

h 

2.5 

3.05 

4.0 

0.015 

0.956 

0.903 

0.03 

0.936 

0.919 

0.896 

0.874 

0.848 

0.885 

0.861 

0.831 

0.804 

0.775 
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Fig.  9.  The  probability  77  =  Pr{W{  <  W2}  as  a  function  of  a2.  The  other  parameter  values  are:  —  0.03, 

kx  =  A2  ==  3.0,  C  =  0.1. 
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Abstract — A  population-based  cohort  consisting  of  126,141  men  and  122,208  women  born  be 
tween  1874  and  1931  and  at  risk  for  breast  or  colorectal  cancer  after  1965  was  identified  by  linking 
the  Utah  Population  Data  Base  and  the  Utah  Cancer  Registry.  The  hazard  function  for  cancer  inci¬ 
dence  is  estimated  from  left  truncated  and  right  censored  data  based  on  the  conditional  likelihood. 
Four  estimation  procedures  based  on  the  conditional  likelihood  are  used  to  estimate  the  age-specific 
hazard  function  from  the  data;  these  were  the  life-table  method,  a  kernel  method  based  on  the  Nelson 
Aalen  estimator,  a  spline  estimate,  and  a  proportional  hazards  estimate  based  on  splines  with  birth 
year  as  sole  covariate. 

The  results  are  consistent  with  an  increasing  hazard  for  both  breast  and  colorectal  cancer  thiough 
age  85  or  90.  After  age  85  or  90,  the  hazard  function  for  female  breast  and  colorectal  cancer  may  reach 
a  plateau  or  decrease,  although  the  hazard  function  for  male  colorectal  cancer  appears  to  continue 
to  rise  through  age  105.  The  hazard  function  for  both  breast  and  colorectal  cancer  appears  to  be 
higher  for  more  recent  birth  cohorts,  with  a  more  pronounced  birth-cohort  effect  foi  breast  cancel 
than  for  colorectal  cancer.  The  age  specific  hazard  for  colorectal  cancer  appeals  to  be  higher  for  men 
than  for  women.  The  shape  of  the  hazard  function  for  both  breast  and  colorectal  cancer  appear  to 
be  consistent  with  a  two-stage  model  for  spontaneous  carcinogenesis  in  which  the  initiation  rate  is 
constant  or  increasing.  Inheritance  of  initiated  cells  appears  to  play  a  minor  role,  (c)  2001  Elsevier 
Science  Ltd.  All  rights  reserved. 


Keywords — Hazard  function,  Truncation,  Survival  analysis,  Breast  cancer,  Colorectal  cancer. 


1.  INTRODUCTION 

The  shape  of  the  hazard  function  may  lead  to  insights  into  the  biology  of  carcinogenesis  which 
may  not  be  easily  discernable  from  a  study  of  the  survival  function  alone.  For  example,  it  is 
typical  in  the  analysis  of  tumor  recurrence  data  to  find  a  hazard  function  that  is  bimodal  or 
unimodal,  and  that  tends  to  zero  as  time  tends  to  infinity  [1].  The  modes  of  the  hazard  may 
be  interpreted  biologically  as  arising  from  two  different  types  of  failure,  one  that  tends  to  occur 
earlier  and  one  that  tends  to  occur  later.  The  decrease  in  the  hazard  function  to  zero  may  lead 
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one  to  conclude  that  there  is  a  nonzero  cured  fraction.  In  fact,  if  we  let  A (t)  denote  the  hazard 
function,  and  p  the  probability  of  cure,  it  follows  from  the  formula 


p  =  lim  exp 

t — ►oo 


A(w) du 


} 


that  there  are  individuals  who  have  been  “cured”  in  the  population  exactly  when  the  hazard 
function  has  finite  integral.  In  particular,  lim/_00A(i)  =  0,  provided  the  limit  exists. 

If  the  hazard  function  under  study  is  from  disease  incidence,  the  “cured  fraction”  must  be 
reinterpreted  as  the  fraction  of  the  population  that  is  “immune”  to  the  disease.  If  the  cumulative 
hazard  appears  to  be  bounded,  for  example,  one  should  expect  the  existence  of  a  nonzero  immune 
fraction.  More  generally,  a  large  degree  of  heterogeneity  in  disease  susceptibility  may  lead  to  a 
population  hazard  function  with  one  or  more  well-defined  maxima.  The  maxima  may  correspond 
to  discrete  subpopulations  with  different  genetic  predisposition  to  disease.  A  maximum  may  also 
result  from  a  continuous  frailty,  as  the  surviving  population  at  higher  ages  may  be  overrepresented 
by  individuals  with  lower  risk  [2]. 

Both  breast  and  colorectal  cancer  are  syndromes  in  which  an  inherited  susceptibility  has  been 
shown  to  play  a  role.  Inherited  mutations  in  p53,  BRCA1,  BRCA2,  the  ataxia-telangiectasia 
gene  (AT),  HRAS,  and  the  androgen  receptor  gene  (AR)  have  been  shown  to  play  a  role  in  breast 
cancer  susceptibility  [3].  About  56%  of  carriers  of  the  mutation  BRCAl  or  BRCA2  will  get 
breast  cancer  by  the  age  of  70  years  [4],  BRCAl  has  an  estimated  allele  frequency  of  between 
0.0002  and  0.001  (95%  Cl)  [5],  and  accounts  for  about  3%  of  diagnosed  breast  cancer  [6].  The 
allele  frequency  of  mutations  in  BRCA2  is  estimated  at  0.00022  [7].  Germline  mutations  in 
p53  and  AR  are  extremely  rare,  and  mutations  in  the  HRAS1  minisatellite  locus  which  confer 
increased  risk  of  breast  cancer  are  also  rare,  having  an  estimated  population  frequency  of  6%  [3], 
In  a  study  of  100  Finnish  breast  cancer  families  analyzed  by  protein  truncation  tests  and  direct 
sequencing,  Vehmanen  et  al.  [8]  found  that  only  21%  of  breast  cancer  families  were  accounted 
for  by  mutations  of  BRCAl  and  BRCA2,  providing  indirect  evidence  for  the  existence  of  other, 
undiscovered  breast  cancer  genes. 

Indirect  evidence  also  exists  for  the  existence  of  additional  colorectal  cancer  genes.  Inherited 
mutations  in  polyposis  coli  (APC)  gene  and  the  hereditary  nonpolyposis  colon  cancer  syndrome 
(HNPCC)  genes  hMSH2,  and  hMLHl  have  been  shown  to  play  a  role  in  colon  cancer  susceptibil¬ 
ity  [3].  After  segregation  analysis  of  203  pedigrees,  Houlston  et  al.  [9]  concluded  that  dominant 
colorectal  cancer  genes  with  a  frequency  of  0.006  account  for  an  estimated  81%  of  colorectal 
cancers  in  patients  under  35,  59%  in  patients  between  35  and  49,  decreasing  to  16%  in  patients 
over  65.  The  I1307K  mutation  of  the  APC  gene,  found  in  Ashkenazi  Jews,  confers  an  estimated 
relative  risk  of  1.7  for  colorectal  cancer  (95%  Cl  1.01-2.87)  [10].  APC  and  HNPCC  are  rare,  and 
contribute  to  a  small  percentage  of  colorectal  cancer  cases  [3]. 

Additional  insight  can  be  gleaned  from  the  hazard  function  for  cancer  incidence  in  the  frame¬ 
work  of  a  mechanistic  model  of  carcinogenesis.  The  most  widely  accepted  model  is  the  Mool- 
gavkar-Venzon-Knudson  two-stage  clonal  expansion  model  [11,12],  The  Moolgavkar-Venzon- 
Knudson  model  has  the  following  assumptions. 

Assumption  A.  Normal,  susceptible  target  cells  are  initiated  according  to  a  (nonhomogeneous) 
Poisson  process  with  intensity  v(t). 

Assumption  B.  The  expansion  of  the  colony  of  initiated  cells  and  malignant  transformation  is 
specified  by  a  stochastic  birth-death-migration  process  with  the  division,  death  (or  differentiation) 
and  transformation.  Premalignant  cells  either  divide  into  two  premalignant  cells  with  rate  «(f), 
die  with  rate  p(t),  or  divide  asymmetrically  into  one  premalignant  cell  and  one  malignant  cell 
with  rate  //(f). 

It  has  been  shown  that  the  hazard  function  for  the  Moolgavkar-Venzon-Knudson  model  with 
constant  parameters  increases  monotonically  and  approaches  an  asymptote  [13],  An  asymptotic 
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value  for  the  hazard  is  also  reached  for  the  Moolgavkar-Venzon-Knudson  model  with  piecewise 
constant  parameters,  and  in  that  case,  the  value  of  the  asymptote  depends  only  on  the  value  of 
the  coefficients  in  the  unbounded  interval  [13,14]. 

Expressions  for  the  survivor  function  were  first  obtained  by  Moolgavkar  and  Luebeck  [13].  A 
simple  explicit  formula  for  the  survivor  function  S(t)  for  the  Moolgavkar-Venzon-Knudson  model 
with  constant  parameters  was  obtained  by  Kopp-Schneider  et  al.  [15]  and  Zheng  [16], 


S(t)  = 


2  ce°-5(-a+/*+/i-c)t 


1 


L(-a  +7?  4 -/i  +  c)  +  (a  —  +  c)erct\ 


a) 


where  c  -  ^/(a  +  0  +  /x)2  -  4a/3.  Zheng  also  presented  an  expression  for  the  probability  gen¬ 
erating  function  for  the  number  of  malignant  cells  given  a  single  malignant  cell  at  time  t  =  0, 
allowing  an  expression  for  the  promotion  time  distribution 


F{t)  = 


(a  —  P  —  n  +  c)(a  —  (3  —  ji  —  c)e  ct  +  (a  —  0  —  /x  +  c)(— a  +  0  +  p.  +  c) 
2a  [(a  —  (3  —  n  +  c)e~ct  +  (—a  +  0  +  n  +  c)] 


(2) 


to  be  given.  It  is  easy  to  see  that  S(t)  and  F(t)  above  are  related  by  the  formula 


S(t)  =  exp  v  J  F(x)  d:r| 


(3) 


which  was  shown  by  Hanin  and  Yakovlev  [17]  to  be  valid  in  a  more  general  setting. 

Yakovlev  and  Tsodikov  [18]  replace  Assumption  B  above  with  the  following  assumption. 
Assumption  C.  Progenitor  cells  are  transformed  into  malignant  lesions  at  a  random  with  cu¬ 
mulative  distribution  function  F(x).  All  progenitor  cells  are  promoted  independently  of  one 
another. 

Assuming  F(0)  =  0,  it  follows  that  the  process  of  malignant  transformation  is  also  a  Poisson 
process,  with  integral  rate  A(t)  =  /„*  v(u)F(t  -  u)  du.  As  in  the  Moolgavkar-Venzon-Knudson 
model,  the  simplest  model  of  spontaneous  carcinogenesis  takes  v{t)  =  v  to  be  constant,  in  which 
case  A(i)  =  v  f*F(u)du  and  the  hazard  function  for  time-to-tumor,  given  by  X(t)  =  vF(t),  is 
nondecreasing.  The  probability  S(t)  that  there  are  no  malignancies  by  time  t  is  then  given  by  (3). 

This  model  may  easily  be  modified  to  handle  inherited  lesions,  via  the  limiting  case  where  v  is 
taken  to  be  a  delta  function  at  the  origin.  If  F(t)  is  assumed  to  be  absolutely  continuous,  then 
the  integral  rate  A(f)  is  equal  to  i'F(f)  and  the  hazard  function  A (t)  =  F'(t )  =  f(t),  where  f(t) 
is  the  density  function  associated  with  F(t).  We  see  that  the  hazard  function  for  spontaneous 
and  inherited  lesions  are  quite  likely  to  have  very  different  shapes. 

Even  though  a  thorough  study  of  the  hazard  function  may  lead  to  new  insight  into  the  process 
of  carcinogenesis,  few  if  any  population-based  cohorts  have  been  analyzed  to  determine  the  hazard 
function  for  cancer  incidence.  In  addition,  time-dependent  variation  in  environmental  risk  factors 
for  cancer  may  cause  estimates  from  a  cross-sectional  study  to  be  misleading.  In  this  paper,  the 
age  specific  hazard  function  for  both  breast  and  colorectal  cancer  incidence  are  estimated  using 
data  from  the  Utah  Cancer  Registry  and  the  Utah  Population  Data  Base.  We  see  that  the  hazard 
function  for  both  these  types  of  cancer  appears  to  be  increasing  monotonically,  at  least  through 
age  85  or  90.  In  the  context  of  the  above  mechanistic  models  of  carcinogenesis,  we  will  see  that 
risks  for  both  these  cancers  at  the  population  level  appear  to  be  relatively  homogeneous,  with 
negligible  inherited  component. 

2.  METHODS 


2.1.  Data 

The  data  for  this  study  was  obtained  by  linking  records  from  the  Utah  Population  Data  Base 
(UPDB)  with  the  Utah  Cancer  Registry  (UCR).  The  UPDB  consists  of  the  genealogical  records 
of  more  than  1,000,000  individuals  who  were  born,  died,  or  married  in  Utah,  or  en  route  to  Utah 
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during  the  nineteenth  and  twentieth  centuries.  Since  1973  the  UCR  has  been  reporting  to  the 
National  Cancer  Institutes  Surveillance  Epidemiology  and  End  Results  (SEER)  program,  and  is 
required  to  maintain  very  high  standards  for  case  reporting  and  follow-up,  and  to  periodically  un¬ 
dergo  quality  control  audits  by  SEER  personnel  to  assure  uniformly  high  quality  and  consistency 
from  year  to  year.  The  available  follow-up  information  comes  either  from  Utah  death  certificates, 
which  have  been  linked  to  the  UPDB  genealogical  data  every  year  from  1933  through  the  begin¬ 
ning  of  1997,  or  from  linkage  of  the  HCFA  beneficiary  data  to  the  UPDB.  The  study  population 
consisted  of  126,141  men  and  122,208  women  recorded  in  the  Utah  Population  Database,  who 
were  born  from  1874  to  1931  and  for  whom  follow-up  information  is  available  that  places  them 
in  Utah  during  the  years  of  operation  of  the  Utah  Cancer  Registry  (1966-present.).  Subjects 
with  purported  follow-up  past  age  105  were  excluded  from  the  data.  There  are  5,372  cases  of 
female  breast  cancer  and  5,177  cases  of  colorectal  cancer  represented  in  the  data.  Analyses  were 
performed  on  subcohorts  based  on  birth  year  (1874-1889,  1890-1899,  1900-1909,  1910-1919,  and 
1920-1931)  and  gender.  For  each  gender,  the  entire  cohort  (birth  years  1874-1931)  was  also  ana¬ 
lyzed  as  a  whole.  The  total  number  of  subjects  and  cases  of  breast  and  colorectal  cancer  for  each 
birth  subcohort  and  gender  are  given  in  Tables  1  and  2.  Male  breast  cancer  was  not  analyzed. 

Table  1.  Number  of  female  subjects  and  cases  of  breast  and  colorectal  cancer,  strat¬ 
ified  by  birth  year. 


Birth  Years 

Number  of 
Subjects 

No.  of  Breast 
Cancer  Cases 

No.  of  Colorectal 
Cancer  Cases 

1874-1889 

10,115 

145 

116 

1890-1899 

19,352 

564 

435 

1900-1909 

27,138 

1258 

755 

1910-1919 

31,162 

1709 

752 

1920-1931 

34,441 

1696 

448 

Total 

;  122,208 

5372 

2106 

Table  2.  Number  of  male  subjects  and  cases  of  colorectal  cancer,  stratified  by  birth 
year. 


Birth  Years 

Number  of 
Subjects 

No.  of  Colorectal 
Cancer  Cases 

1874-1889 

6,850 

101 

1890-1899 

16,307 

341 

1900-1909 

27,122 

768 

1910-1919 

34,731 

874 

1920-1931 

41.131 

587 

Total 

126,141 

2671 

2.2.  Truncation:  Nonparametric  Estimation 

We  wish  to  estimate  the  age  specific  hazard  function  for  breast  and  colorectal  cancer  from  the 
data  described  above,  taking  into  account  that  the  data  is  subject  to  random  truncation:  cases 
which  occurred  during  or  before  1965  are  not  recorded  in  the  dataset.  Subjects  were  between  the 
ages  of  34  and  86,  at  the  time  of  truncation.  Thus,  analysis  of  the  data  must  take  into  account  not 
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only  to  the  effects  of  right  censoring,  but  also  the  effects  of  left  truncation  due  to  delayed  entry 
into  the  risk  set.  The  topic  of  random  truncation  is  not  mentioned  in  several  authoritative  texts 
such  as  [19]  and  [20],  and  may  be  unfamiliar  to  some  readers,  and  therefore,  will  be  discussed  in 
this  and  the  following  section. 

Let  the  truncation  time  Y  have  distribution  function  G(y)  and  the  failure  time  (time  of  cancer 
diagnosis)  X  have  distribution  function  F(x).  We  require  that  truncation  be  independent  of 
failure  and  for  simplicity  assume  no  censoring  for  the  present.  Observations  are  conditional 
on  X  >  Y.  Let  G*(y)  and  F*(x)  be  the  corresponding  distribution  functions,  conditional  on 
X  >  Y .  Let  S(x)  —  1  —  F(x)  be  the  survivor  function  of  X .  Suppose  that  we  have  observations 
(Y*,X*), . . . ,  (Y*,X*)  from  the  conditional  distribution.  The  full  likelihood  of  the  observed  data 
is  given  by 

n 

3  =  1 

where  a  =  ff  <x  dF(x)dG(y).  A  key  observation  is  that  if  X  and  Y  are  independent,  then  the 
hazard  of  X  given  X  >  Y  =  y  at  x  >  y  is  equal  to  the  hazard  of  X  at  x  [21,22].  This  observation 
leads  to  the  result,  first  mentioned  by  Kaplan  and  Meier  [23],  that  if  the  distribution  G(t)  is 
allowed  to  vary  freely,  the  natural  generalization  of  the  product  limit  estimator,  given  by  the 
formula 

^)=n(>-^).  <5> 


dF{Xj)  dG{Yj) 


(4) 


where  R(u)  =  #{Y*  <  U  <  X*}  is  the  number  at  risk  at  [/,  is  the  nonparametric  maximum 
likelihood  estimator  (NPMLE)  of  the  survivor  function  S(t)  of  X  (see,  for  example,  [21,22,24]). 

This  result  extends  naturally  to  the  case  with  random  independent  censoring  [24].  It  also 
easily  follows  that  in  the  nonparametric  setting  (again  with  no  censoring),  maximizing  (4)  is 
equivalent  to  maximizing  the  conditional  likelihood  of  (XJ\...,X*)  given  (Y^*, . . . ,  Y*),  which 


can  be  written 


CL  =  H 


f(Xt) 

S(Y*) 


(6) 


(see,  for  example,  [23-26]).  Maximizing  the  conditional  likelihood  also  leads  to  the  familiar 
Nelson- Aalen  estimator  for  the  integrated  hazard  function  H(t)  of  X  [24],  which  is  given  by 


m  =  y,  wr1-  (?) 

x*<t 

These  results  can  be  extended  to  the  case  of  right  censoring  [24], 


2.3.  Truncation:  Parametric  Models 

We  consider  the  situation  where  X  and  Y  are  independent,  F(x)  is  parametrized,  while  G(y) 
is  allowed  to  vary  freely.  In  a  later  section,  F(x)  will  be  come  from  a  quadratic  spline  model. 

The  data  are  independent  pairs  (yi,  xi), . . . ,  (yn, xn)  from  the  joint  distribution  (Y,  X),  condi¬ 
tional  on  (Y  <  X).  We  suppose,  for  simplicity,  that  there  are  no  ties  among  yi,y2,  •  •  •  ,2/m  and 
suppose  X  has  absolutely  continuous  distribution  function  coming  from  a  family  F(x;  z)  param¬ 
eterized  by  a  vector  z,  with  corresponding  survival  function  S(x\z)  =  1  —  F(x:  z )  and  density 
/(.x; z).  The  NPMLE  for  G  should  consist  of  (unknown)  point  masses  q\, .  ■ . , qn  placed  at  the 
points  yi,y2, . . . ,  yn-  The  logarithm  of  the  complete  likelihood  (4)  can  be  rewritten 


n 

log  (L)  =  ^2  [lo§  (/  (*<;  ^ ))  +  i°g  (*)]  -  nl°g 


n 

Y  s  (ft; £)  ft 


(8) 


13CC 


K.  M.  Boucher  and  R.  A.  Kerber 


If  we  factor  the  out  the  part  of  the  likelihood  corresponding  to  (6).  the  logarithm  is  given  by 

n 

log  (CL)  =  £  [log  (/  (.r,;  £ ))  -  log  (S  (y,:  £))] .  (9) 

7  —  1 

We  now  discuss  the  changes  which  must  be  made  when  censoring  and  additional  covariAtes  are 
present.  If  s  is  a  vector  of  additional  covariates,  A (x,s;z)  denotes  the  hazard  associated  with 
and  A (x,s;z)  the  cumulative  hazard,  we  note  that  (9)  becomes 

n 

log  (CL)  =  [log  (A  (xj ,  Sj ;  £))  -  (A  (x,,  s,:z)  -  A  (</,,£, :  £ ))] .  (10) 

7—1 

In  the  presence  of  right  censoring  which  is  independent  of  both  the  failure  and  truncation  times, 
Xj  is  replaced  in  the  above  formulation  by  the  minimum  of  the  failure  and  censoring  time.  The 
term  f(x,s:z)  in  the  likelihood  is  replaced  by  f(x,s:z)6S(x,s:z)l~6,  where  Sj  =  1  if  observa¬ 
tion  i  is  a  failure  and  Sj  =  0  otherwise,  and  the  conditional  likelihood  (C)  (with  .?•/,  s},  and  ijj 
regarded  as  fixed)  becomes 


cl  - 1 1 


/  (.T,.  .?,;£) 0  S  (xj,Sj\  z) 


( I  —6) 


S  O/mS,;- 


In  this  setting,  log  (CL)  becomes 


log  (CL)  =  5>log(A  (xhSj:z))  -  (A  (.r, -  A  . 

7-1 


(ii) 


In  the  subsequent  analysis,  we  choose  to  maximize  (11)  rather  than  the  full  likelihood.  In  a 
separate  paper,  we  will  show  that  this  is  equivalent  to  maximizing  the  full  likelihood,  under 
ap p  rop r  i  ate  concl  it  ions . 


2.4.  Spline  Models 

We  choose  to  model  the  hazard  via  quadratic  splines  as  in  [27].  A  quadratic  spline  with  m 
knots  specifies  the  hazard  to  be  of  the  form 


2  ni 

X>n  (t)  =  +  7>2  ~  TJ  )+  ’  (12) 

/=0  j=l 

where  (:;;)+  =  max(x’,0).  For  each  birth  cohort,  we  fit  splines  with  knots  which  were  equally 
spaced  in  the  interior  of  the  interior  [Tm\n,  Tinax],  where  rmill  is  the  minimum  truncation  age 
in  the  cohort  and  Xinax  the  maximum  follow-up  (failure  or  censoring)  time.  Restrictions  were 
placed  on  the  coefficients  to  ensure  that  A m(t)  remained  positive  for  all  t.  Thus,  with  m  knots  the 
number  of  parameters  was  m  +  3.  Models  were  fit  using  maximum  likelihood  techniques  applied 
to  the  conditional  likelihood,  as  given  by  (11). 

The  hazard  function  was  estimated  for  breast  cancer  incidence  (women  only)  and  for  colorectal 
cancer  incidence  (both  men  and  women).  The  spline  estimates  were  computed  by  maximizing 
log(CL)  using  the  algorithm  of  Powell  [28].  We  started  with  one  knot  and  increased  the  number 
of  knots  until  the  fit  was  not  improved,  as  determined  by  the  likelihood  ratio  test  at  the  signifi¬ 
cance  level  a  =  0.05.  Two  other  subcohort  estimates  of  the  hazard  function  were  computed  for 
comparison  with  the  spline  estimator;  a  life  table  version  of  (5).  and  a  Gaussian  kernel  estimate 
based  on  the  Nelson-Aalen  estimator  (7). 
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(a) 


(b) 


(c) 

Figure  1.  Four  estimates  of  the  age-specific  hazard  function  for  female  breast  cancer, 
stratified  by  birth  cohort:  a  spline  estimate  (labeled  “Spline” ),  a  kernel  estimate 
based  on  the  Nelson-Aalen  estimator  (labeled  “Kernel”),  a  life  table  estimate  (labeled 
“Life  Table”),  and  a  proportional  hazards  spline  estimate  using  all  strata,  with  birth 
year  as  sole  covariate,  set  at  the  stratum  mean  (labeled  “Combined  Spline”). 
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(d) 


(e) 


Figure  1.  (cont.) 


2.5.  Proportional  Hazards 

It  became  clear,  when  fitting  models  to  the  subcohorts,  that  there  was  a  birth  cohort  effect  in 
the  data.  At  the  same  time,  we  wished  to  have  estimates  of  the  hazard  for  the  entire  age  range  of 
34-100-h  years.  We  therefore  fit  proportional  hazards  models  with  splines  Am(£)  for  the  baseline 
hazard  and  a  single  covariate  s  representing  birth  year.  The  resulting  hazard  function  has  the 
form 

A  m{t,  s;  0)  =  exp(0s)Xm(t).  (13) 

The  model  was  again  fit  using  the  conditional  likelihood  of  the  form 

n 

log(CL)  =  ^  [<5,  log (Am  ( xusi\0 ))  -  (A (a:*, «,;/?)  -  A{yhsf,  0))} ,  (14) 

i= 1 

which  is  (11)  with  A(.x,s,  z)  =  A m(x.t,  sf,  f3). 

3.  RESULTS 

Estimates  of  the  age  specific  hazard  for  female  breast  cancer  are  presented  in  Figure  1  for 
the  1874-1889,  1890-1899,  1900-1909,  1910-1919,  and  1920-1931  birth  subcohorts.  Age  specific 
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(a) 


(b) 


(c) 

Figure  2.  Four  estimates  of  the  age-specific  hazard  function  for  female  colorectal  can¬ 
cer,  stratified  by  birth  cohort:  a  spline  estimate  (labeled  “Spline”),  a  kernel  estimate 
based  on  the  Nelson-Aalen  estimator  (labeled  “Kernel”),  a  life  table  estimate  (la¬ 
beled  “Life  Table”),  and  a  proportional  hazards  spline  estimate  using  all  strata,  with 
birth  year  as  sole  covariate,  set  at  the  stratum  mean  (labeled  “Combined  Spline”). 
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(d) 


Age  (years) 

(e) 

Figure  2.  (cont.) 

hazards  for  colorectal  cancer  are  presented  in  Figures  2  and  3,  stratified  by  birth  cohort  and 
gender.  Each  figure  presents  three  estimates  of  the  hazard  from  the  subcohort  alone,  namely 
the  life  table  estimate,  the  kernel  estimate  based  on  the  Nelson-Aalen  estimator  and  a  spline 
estimate,  as  well  as  one  gender-specific  estimate  from  a  proportional  hazards  model  with  birth 
year  as  covariate,  fit  to  data  from  all  birth  subcohorts  (1874-1931).  The  covariate  is  set  to  the 
mean  birth  year  of  the  subcohort.  We  note  that  approximately  40  years  of  follow-up  are  available 
for  any  one  subcohort,  as  follow-up  data  are  available  from  approximately  1905-1995. 

We  found  that  splines  with  very  few  knots  appeared  to  fit  the  data.  In  all  but  one  case,  two 
knots  were  sufficient  for  the  spline  estimates,  as  determined  by  the  likelihood  ratio  test,  and  in  the 
remaining  case  (breast  cancer,  birth  years  1874-1889),  one  knot  sufficed.  The  hazard  function  for 
both  breast  and  colorectal  cancer  appears  to  increase  monotonically,  at  least  until  the  age  of  85 
or  90,  when  the  subcohort  specific  estimates  of  the  hazard  estimates  for  women  for  both  breast 
and  colon  cancer  appear  to  flatten  or  decrease  while  the  estimate  for  men  appears  to  continue 
to  increase.  (In  each  of  the  three  cases,  the  proportional  hazards  model  provides  estimates  of 
the  hazard  function  which  increase  through  all  ages.)  We  also  note  that,  in  all  the  proportional 
hazards  models,  the  birth  cohort  effect  was  highly  significant  (p  <  0.0001).  We  also  see  from  the 
subcohort  analysis  that  the  proportional  hazards  assumption  appears  to  be  adequate,  at  least  up 
until  the  age  of  85  or  90,  when  proportionality  may  fail  for  women. 


» 
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(c) 

Figure  3.  Four  estimates  of  the  age-specific  hazard  function  for  male  colorectal 
cancer,  stratified  by  birth  cohort:  a  spline  estimate  (labeled  “Spline”),  a  kernel 
estimate  based  on  the  Nelson-Aalen  estimator  (labeled  “Kernel”),  and  a  life  table 
estimate  (labeled  “Life  Table”),  and  a  proportional  hazards  spline  estimate  using  all 
strata,  with  birth  year  as  sole  covariate,  set  at  the  stratum  mean  (labeled  “Combined 
Spline” ) . 
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Figure  3.  (cont.) 


Figure  4.  Comparison  of  the  age-specific  hazard  function  estimates  for  female  breast 
cancer  for  various  birth  cohort  strata  from  a  proportional  hazards  model  spline  model. 
Birth  year  covariate  set  at  the  mean  value  for  each  stratum:  1884.41  for  the  1874- 
1889  stratum,  1894.90  for  the  1890-1899  stratum,  1904.54  for  the  1900-1909  stratum, 
1914.52  for  the  1910-1919  stratum,  and  1925.24  for  the  1920-1931  stratum. 
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Figure  5.  Comparison  of  the  age-specific  hazard  function  estimates  for  female  colorec¬ 
tal  cancer  for  various  birth  cohort  strata  from  a  proportional  hazards  model  spline 
model.  Birth  year  covariate  set  at  the  mean  value  in  each  stratum:  1884.41  for  the 
1874-1889  stratum,  1894.90  for  the  1890-1899  stratum,  1904.54  for  the  1900-1909 
stratum,  1914.52  for  the  1910-1919  stratum,  and  1925.24  for  the  1920-1931  stratum. 


Age  (years) 

Figure  6.  Comparison  of  the  age-specific  hazard  function  estimates  for  male  colorectal 
cancer  for  various  birth  cohort  strata  from  a  proportional  hazards  model  spline  model. 

Birth  year  covariate  set  at  the  mean  value  in  each  stratum:  1884.74  for  the  1874- 
1889  stratum,  1895.06  for  the  1890-1899  stratum,  1904.74  for  the  1900-1909  stratum, 

1914.57  for  the  1910-1919  stratum,  and  1925.31  for  the  1920-1931  stratum. 

We  also  note  that  the  colorectal  cancer  risk  estimates  are  higher  for  men  than  for  women.  For 
example,  the  estimated  age  specific  yearly  hazard  for  the  1920-1931  birth  cohort  at  age  70  is 
approximately  0.0013  for  women,  and  about  0.0017  for  men,  or  about  30%  higher  for  men. 

The  estimated  hazards  from  the  proportional  hazards  models  over  a  seventy  year  range  are 
presented  in  Figures  4-6.  The  estimated  hazards  increase  as  the  birth  cohorts  become  more 
recent,  with  coefficient  estimates  of  (3  =  0.0347  (year-1)  for  female  breast  cancer,  (3  =  0.016 
(year”1)  for  female  colorectal  cancer,  and  (3  =  0.020  (year”1)  for  male  colorectal  cancer.  Thus, 
the  additional  hazard  for  more  recent  birth  cohorts  appears  to  be  more  pronounced  for  breast 
cancer  than  for  colorectal  cancer. 

4.  DISCUSSION 

As  noted  in  the  introduction,  the  presence  of  a  large  degree  of  heterogeneity  in  the  risk  for 
a  population  may  lead  to  a  decreasing  age  specific  hazard  function.  Since  we  see  little  or  no 
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evidence  of  a  decreasing  hazard  for  either  breast  or  colorectal  cancer  at  least  until  age  85  or  90, 
it  appears  that  the  risk  is  relatively  homogeneous  for  both  these  cancers  over  this  age  range.  In 
particular,  there  appears  to  be  little  evidence  for  a  high  immune  fraction  for  either  breast  or 
colorectal  cancer.  We  should  also  note  that  the  presence  of  a  monotone  increasing  hazard  over  a 
limited  range  does  not  completely  rule  out  heterogeneity.  The  data  is  quite  consistent  with  the 
degree  of  heterogeneity  that  might  result  from  known  cancer  genes,  as  long  as  the  risk  is  generally 
increasing  (at  least  through  age  90)  in  the  population  as  a  whole.  There  is  little  or  no  evidence  of 
an  inherited  component  to  the  risk,  as  a  large  inherited  component  might  be  expected  to  provide 
a  local  maxima  to  the  hazard  rather  early  in  life,  certainly  prior  to  age  85. 

One  may  extend  the  more  general  two-stage  model  of  carcinogenesis  presented  in  the  introduc¬ 
tion  to  take  cell  death  into  account,  by  adding  a  Poisson  process  of  cell  death  which  competes 
with  the  process  of  malignant  transformation,  as  suggested  by  Yakovlev  and  Polig  [29].  This 
model  has  been  successfully  applied  to  data  from  radiation  induced  and  chemically  induced  le¬ 
sions  [30-32].  With  the  cell  death  component,  it  becomes  less  clear  that  the  hazard  function 
should  increase  monotonically  in  the  case  of  spontaneous  carcinogenesis.  In  fact,  in  the  simplified 
case  of  constant  rates  v\  of  initiation  and  v2  of  cell  death,  and  arbitrary  cumulative  distribution 
function  F(t)  for  time  to  transformation  of  intermediate  lesions,  the  hazard  function  for  time  to 
tumor  has  the  form 

A (t)  =  V\  exp  (-i/20  F(t)-  (lb) 

We  note  that  according  to  this  model  the  clock  for  cell  death  in  this  model  starts  at  birth.  If 
the  constant  v2  >  0  in  (15),  then  A (t)  must  decrease  exponentially  since  F(t)  approaches  one 
as  t  approaches  infinity.  We  conjecture  that  in  the  present  context  the  cell  death  component  is 
very  small,  so  that  it  does  not  dominate  A (t)  until  after  age  85.  The  higher  hazard  rate  for  male 
colorectal  cancer,  as  well  as  the  continued  increase  in  hazard  through  age  105,  may  be  attributed 
to  a  smaller  rate  of  cell  death.  Another  possibility  is  that  the  cell  death  should  not  be  measured 
from  birth,  but  from  formation  of  the  initiated  cell  (as  in  another  variation  of  the  model  suggested 
in  [29]). 

We  have  noted  in  the  Results  section  that  proportionality  of  hazard  appears  to  fail  after  age  90 
for  both  breast  and  colorectal  cancer  in  women.  This  result  may  be  due  to  sampling  variability, 
or  additional  bias  unique  to  women  at  these  high  ages.  We  note  that,  there  are  only  110  female 
breast  cancer  cases  and  77  female  colorectal  cancer  cases  after  age  90.  They  are  distributed  over 
a  fifteen  year  period,  for  an  average  of  7.7  breast  cancer  and  5.1  colorectal  cancer  cases  per  year 
in  this  range.  In  addition,  data  linkage  is  more  difficult  for  women,  who  are  more  likely  to  have 
changed  names  than  men.  An  additional  indication  that  the  lack  of  proportionality  for  women 
may  be  spurious  is  that  we  do  not  see  this  apparent  lack  of  proportionality  in  men. 
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Background  Several  measures  of  familial  disease  aggrega¬ 
tion  have  been  proposed,  but  only  a  few  of  these  are 
designed  to  be  implemented  at  the  individual  level.  We 
evaluate  two  of  them  in  the  context  of  breast-cancer 
incidence. 

Methods  A  population-based  cohort  consisting  of  114  429 
women  born  between  1874  and  1931  and  at  risk  for 
breast  cancer  after  1965  was  identified  by  linking  the 
Utah  Population  Data  Base  and  the  Utah  Cancer  Reg¬ 
istry.  Two  competing  methods  were  used  to  obtain  pre¬ 
dictors  of  familial  aggregation  of  risk:  the  number  of 
first-degree  relatives  with  breast  cancer  (NIST)  and  the 
familial  standardised  incidence  ratio  (FSIR),  which 
weights  the  disease  status  of  relatives  based  on  their 
degree  of  relatedness  with  the  proband.  Relative  risks 
were  estimated  using  Mantel-Haenszel,  Poisson  regres¬ 


sion  and  spline  regression  methods.  The  age-dependent 
hazard  function  was  also  estimated. 

Results  Compared  to  a  baseline  category  containing  91.5% 
of  the  subjects,  the  0.7%  of  subjects  identified  as  high 
risk  using  the  FSIR  criterion  had  a  relative  risk  of  about 
2.8,  while  those  identified  as  high  risk  using  the  NIST 
criterion  had  a  relative  risk  of  2.0.  Moderate-risk  sub¬ 
jects  had  a  relative  risk  of  about  1.75  using  either  criteri¬ 
on.  FSIR  was  a  significant  predictor  of  risk  even  for 
those  with  no  affected  first-degree  relatives.  No  decline 
in  the  baseline  risk  was  observed  at  advanced  ages. 

Conclusions  FSIR  appears  to  be  a  better  predictor  of 
breast-cancer  risk  than  NIST,  particularly  for  high-risk 
subjects. 

Keywords  familial  risk,  hazard  function,  truncation,  sur¬ 
vival  analysis,  breast  cancer. 


Introduction 

Heterogeneity  in  a  population  may  lead  to  population 
estimates  of  the  hazard  that  do  not  reflect  individual 
risk.  For  example,  if  we  let  X{t)  denote  the  hazard  func¬ 
tion,  and  p  the  probability  of  immunity  to  a  particular 
disease,  it  follows  from  the  formula: 


that  there  are  individuals  who  are  ‘immune’  in  the  popu¬ 
lation  exactly  when  the  hazard  function  has  finite  inte¬ 
gral.  In  particular,  lim^^f)  =  0,  provided  the  limit 
exists.  More  generally,  a  large  degree  of  heterogeneity  in 
disease  susceptibility  may  lead  to  a  population  hazard- 
function  with  one  or  more  well-defined  maxima.  The 
maxima  may  correspond  to  discrete  subpopulations  with 
different  genetic  predisposition  to  disease.  A  maximum 
may  also  result  from  a  continuous  frailty,  as  the  surviv¬ 
ing  population  at  higher  ages  may  be  over-represented 
by  individuals  with  lower  risk1. 

There  is  evidence  of  heterogeneity  for  most  cancers. 
According  to  Easton2,  ‘All  cancer  types  exhibit  familial 


clustering,  suggestive  of  a  significant  inherited  compo¬ 
nent’.  He  goes  on  to  conclude  that,  as  of  1994,  known 
cancer  genes  accounted  for  0.5-1%  of  all  cancer  cases, 
and  that  this  figure  would  increase  as  more  cancer  genes 
are  discovered.  The  breast-cancer  genes  BRCA1  and 
BRCA2  both  contribute  to  an  increased  risk  of  breast 
cancer.  BRCA1  has  an  estimated  allele  frequency  of 
0.0002-0.001  [95%  confidence  interval  (Cl)]3,  and 
accounts  for  about  3%  of  diagnosed  breast  cancer4.  The 
allele  frequency  of  mutations  in  BRCA2  is  estimated  at 
0.000225.  Vehmanen  et  al6  found  that  only  21%  of 
breast-cancer  families  were  accounted  for  by  mutations 
of  BRCA1  and  BRCA2 ,  providing  indirect  evidence  for 
the  existence  of  other,  undiscovered  breast-cancer 
genes. 

In  our  previous  paper7,  linked  population-based  data 
from  the  Utah  Cancer  Registry  (UCR)  and  the  Utah 
Population  Data  Base  (UPDB)  was  used  to  estimate  the 
population-level  hazard  function  for  breast  and  colorec¬ 
tal  cancer,  stratified  by  birth  cohort.  We  found  that  the 
hazard  functions  for  both  breast  and  colorectal  cancer 
appeared  to  be  monotone  increasing  functions  for  both 
genders  and  all  birth  cohorts.  This  contrasts  with  the 
model-based  estimates  of  Moolgavkar  et  al .8,  who 
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found  the  hazard  function  to  decrease  sharply  starting 
sometime  past  the  age  of  70.  The  lack  of  clear  multiple 
modes  in  the  hazard  function  highlighted  the  fact  that 
more  delicate  methods  would  be  needed  to  account  for 
the  known  heterogeneity  of  risk. 

A  number  of  measures  of  familial  disease  aggrega¬ 
tion  have  been  used  or  proposed,  but  only  a  few  of  these 
are  designed  to  be  implemented  at  the  individual  level. 
The  most  common  epidemiological  measure  of  familial 
risk  is  an  indicator  of  whether  one  or  more  first-degree 
relatives  have  been  diagnosed  with  cancer,  or  some 
other  disease.  Khoury  and  Flanders9  have  noted  that 
measures  of  this  sort  are  prone  to  bias  under  a  variety  of 
conditions.  Nonetheless,  it  is  a  widely  used  and  easily 
understood  measure  of  familial  risk  that  can  easily  be 
ascertained  in  a  clinical  setting.  A  second  category  of 
family-history  measures,  suggested  by  Kerber10,  is 
derived  from  the  complete  risk  experience  of  all  observ¬ 
able  biological  relatives,  adjusted  for  the  age,  sex,  num¬ 
ber  and  degree  of  the  relatives.  The  total  familial  risk  is 
summarised  as  a  familial  standardised  incidence  ratio 
(FSIR),  or  a  familial  rate  (FR).  FSIR  and  FR  are  less 
prone  to  bias  and  substantially  more  sensitive  than  a 
crude  indicator  variable,  but  require  fairly  detailed  fami¬ 
ly  history  data,  which  may  rarely  be  available  in  a  clini¬ 
cal  setting. 

In  this  paper,  age-specific  estimates  of  the  relative 
risk  (RR)  for  breast  cancer  are  obtained  using  three  dif¬ 
ferent  methods,  with  the  number  of  affected  first-degree 
relative  (NIST)  and  FSIR  as  predictors  of  risk.  We  show 
that  FSIR  performs  better  in  identifying  subjects  at  high 
risk.  As  a  by-product  of  the  analysis  we  find  that  the 
risk  for  breast  cancer  appears  to  be  increasing  as  a  func¬ 
tion  of  age,  even  at  advanced  ages. 

Data 

The  data  for  this  study  were  obtained  by  linking  records 
from  the  UPDB  with  the  UCR.  The  UPDB  consists  of 
the  genealogical  records  of  more  than  1  000  000  indi¬ 
viduals  who  were  born,  died,  or  married  in  Utah,  or  en 
route  to  Utah,  during  the  nineteenth  and  twentieth  cen¬ 
turies.  The  available  follow-up  information  comes  either 
from  Utah  death  certificates,  which  have  been  linked  to 
the  UPDB  genealogical  data  every  year  from  1933 
through  the  beginning  of  1997,  or  from  linkage  of  the 
HCFA  beneficiary  data  to  the  UPDB.  The  study  popula¬ 
tion  consisted  of  122  208  women  recorded  in  the  Utah 
Population  Database,  who  were  born  during  1874-1931 
and  for  whom  follow-up  information  was  available  that 
places  them  in  Utah  during  the  years  of  operation  of  the 
Utah  Cancer  Registry  (1966-present).  Subjects  with 
purported  follow-up  past  age  105  were  excluded  from 
the  data.  Potential  subjects  who  had  no  relatives  who 


were  also  in  the  risk  set,  and  therefore  for  whom  no 
measures  of  familial  aggregation  could  be  computed, 
were  removed  from  the  data.  Excluding  these  two 
groups  removed  an  additional  7779  women,  leaving  a 
study  population  of  114  429  women.  There  are  5092 
cases  of  female  breast  cancer  in  the  data.  Only  female 
breast  cancer  was  analysed.  Additional  details  on  the 
data  are  given  in  Boucher  and  Kerber7. 

Methods 

Number  of  first-degree  relatives 

The  simplest  and  most  easily  understandable  measure  of 
familial  aggregation  is  the  number  of  affected  NIST.  Of 
the  114  429  women  in  the  data  set,  9765,  or  approxi¬ 
mately  8.5%,  had  at  least  one  NIST  with  breast  cancer 
and  795  women,  or  0.69%,  had  two  or  more  affected 
NIST.  Having  more  than  two  NIST  with  breast  cancer 
was  extremely  rare:  56  women  had  three,  and  10  women 
had  the  maximum  of  four.  For  this  reason,  subjects  with 
two  or  more  affected  relatives  were  combined  for  data 
analysis.  Table  1  gives  the  distribution  of  number  of 
cases  and  person-years  of  risk,  stratified  by  N 1  ST,  age, 
and  birth-year.  Mantel-Haenszel  RR  and  95%  Cl  are 
also  presented. 

Familial  standardised  incidence  ratio 

The  second  measure  of  familial  aggregation  we  used 
was  a  modification  of  the  FSIR10.  The  FSIR  is  derived 
from  the  complete  risk  experience  of  all  observable  bio¬ 
logical  relatives,  adjusted  for  age,  sex,  number  and 
shared  inheritance  with  the  subject.  Formally,  FSIR  is 
defined  in  terms  of  the  kinship  coefficient1 1  c(ij) 
between  individuals  i  and  y,  which  gives  the  probability 
that  two  individuals  share  randomly  selected  genes  at  a 
given  locus  by  common  descent.  The  kinship  coefficient 
is  defined  by 

C(ij)  =  (1/2)  X),2"/</,) 

where  /U  is  the  total  number  of  distinct  shortest  paths 
through  a  common  ancestor  between  individuals  /  and  j, 
and  l(p)  is  the  length  in  number  of  reproductive  events 
of  the  path  p. 

A  simple  case  is  the  one  in  which  the  subjects  with 
indices  i  and  j  are  full  siblings.  In  this  case,  there  are 
two  paths  of  length  two  between  the  subjects,  one 
through  the  father  and  one  through  the  mother,  and  c(ij) 
-  ( l/2)(2”2  +  2“2)  =  0.25.  For  another  common  exam¬ 
ple,  consider  the  case  in  which  the  individuals  labeled  i 
and  j  are  first  cousins.  In  the  most  typical  case,  there  is 
exactly  one  pair  of  shared  grandparents,  with  the  other 
pairs  of  grandparents  distinct  and  unrelated  to  each 
other  or  the  shared  pair.  In  this  case,  there  are  two  paths 


MEASURES  OF  FAMILIAL  AGGREGATION  AS  PREDICTORS  OF  BREAST-CANCER  RISK 


379 


<5  vo 

o 

CN 

t> 

On 

T— 1 

Tt- 

on 

ON 

CO 

^t- 

ft  o 

vo 

O 

On 

on 

on 

in 

<N 

l> 

i-H 

m 

ft  <N 

i 

00 

cd 

cd 

cd 

cd 

in 

r*H 

I 

i 

cd 

p  vo 

r-H 

u 

t-H 

oo 

o 

m 

00 

O 

00 

r- 

in 

o 

ON 

?  vo 

r-H 

00 

no 

T-H 

CO 

vq 

o 

in 

3d 

i 

H 

d 

o 

d 

r- H 

H 

o 

l 

i 

r-H 

o 

o 

vo 

m 

co 

<N 

in 

CO 

On 

r-H 

o 

o 

o 

in 

vo 

00 

CO 

o 

of; 

o 

r-H 

o 

o 

q 

vd 

o 

cd 

d 

T-H 

r-H 

(N 

c4 

<N 

cd 

d 

d 

o 

cd 

oooN^oiotN^tcocnr-'OooocN 

coo^tinovoi-<Ncoincovoooo 


00  o  CO  O 


(NCNCScnOOOCN 


^OOI-OO^OUJO^OOh 

tNootoinoNO\in^(N^u 

^(NMUMCNHH 


h  h  ^  T  (N  O 


O  O  <N 

l> 


00  O  On  O 
,  SO  VO  <N  CO 


ftOco(Nco<N<N<N 

p  ^ 


cv  O  vo  ON  co  <N  co 

^  CO  OO  (N  rH  OO  Tt 
(N  (N  N  N  H  ro 


®  *H 

A?  ^oocin^c^Hoofn^mooin 
>  cq  oo  on  i>  <N  in  cn  in  <N  t--  o 
— i  i-h  <t-h  o 


fnioooo>HioHH\o^GN^o>o 
t  ro  ^  u  ^  tr,  oo  'C  -h  oq  n  r^ 

CO  t— (  ^H  (S)  ^H  ’“H  r>H  f-H  r-H  ^H  ^H  i— H  CO  r-H 


oooin\Din^OMr)0\h^ooh 


^nOH^^^oohooh^ttoo 

r~-vocooNinin<NOOvoi>C'-xi'vo 

(Nocin  —  r-^<Nooc\i^r-c\i  in 


N  (N  m  n  M  <n 


V0  o  ^  O  O  ON 


ro  >n  c^i  in  n  >n  ro 


(NmvooocooovorN 


ONVooof>ONcor-ONoo\[^Tj- 

ocoONOONcoinoococoovr'- 

inoococoOcoooovO^Or- 

’-OTXt^i^oOhTOCin 

cnooin^hN'tH^oN^^t 

HMcsmcnroMHH 

riinoo^TUOT^Oh^ 

^HininONooincoTf-Tfr-voov 

-^minLh^D^M 


ON'tO^tON^ON'^ON'tON'^t 
cntt^m^^Uhooooo\  , 

I  1  T  I  I  I  I  I  I  I  I  I  +  — 
inoinOinoinOinoinomoH 
co^j-^ininvovor^r^oooooNON< 


380 


KM  BOUCHER  AND  RA  KERBER 


of  length  four  between  individuals  /  and  j  and  c(ij)  = 
(l/2)(2-4+2-4)  =  1/16.  The  relevant  reproductive  events 
in  this  case  are  the  ones  corresponding  to  subjects  i  and 
j ,  and  the  reproductive  events  for  the  parent  of  subject  / 
and  the  parent  of  subject  j  that  are  siblings.  Finally,  we 
suppose  that  we  have  a  stratified  population;  the  number 
of  strata  is  K ,  the  population  incidence  in  the  Alh  stratum 
is  given  by  Xk,  and  let  tjk  be  the  time  that  the  jth  person 
spent  in  the  Ath  stratum  of  risk.  Let  /  =  1  if  the  jth 
member  has  the  disease  and  0  otherwise.  The  FSIR  is 
then  defined,  for  the  tth  individual,  by: 

YjljC(iJ) 

FSIR .  =  -  (2) 

ZZ  tjkhc^3) 

k=\  jli 

The  sum  for  the  index  j  runs  over  all  related  individuals 
in  the  pedigree  (excluding  subject  /). 

In  deriving  a  measure  of  variance  VARi  for  FSIRr  it 
was  assumed  that  the  denominator  of  the  above  expres¬ 
sion  is  fixed,  and  that  for  each  fixed  path  length  the 
number  of  observed  cases  follows  a  Poisson  distribution 
with  mean  equal  to  the  expected  number  of  cases  in  the 
stratum.  For  this  study,  risk  was  stratified  by  age  and 
gender. 

A  difficulty  with  using  the  ‘raw’  FSIR  scores  is  that 
the  amount  of  information  from  which  it  is  constructed 
for  a  particular  individual  is  highly  variable.  A  low 
FSIR  score  could  be  an  indicator  of  low  risk,  or  simply 
reflect  a  small  family  size.  We  therefore  chose  to  adjust 
the  scores  using  an  empirical  Bayes  procedure  before 
incorporating  them  into  a  regression  analysis.  As  the 
raw  FSIR  scores  are  highly  skewed,  we  applied  the 
empirical  Bayes  procedure  to  log(l  +  log(l  +FS1R)). 
The  basic  assumption  of  the  empirical  Bayes  adjustment 
is  that  the  ‘true'  values  of  log(l  +log(l  -hFSIRjf)  are 
normally  distributed.  The  mean  and  variance  of  p  are 
estimated  empirically  and  iteratively  from  the  data.  The 
procedure  we  use  is  similar  to  the  one  suggested  by 
Greenland  and  Robins12. 

More  specifically,  we  suppose  that  after  iteration 
72-1  we  have  current  estimates  Fin-\  anc*  °in-\  f°r 
true  value  of  log(l  +log(l  +FSIR))  for  the  ith  individual, 
as  well  as  an  overall  mean  pn_x  and  variance  an_x  for 
the  jur  We  then  computed  new  estimates  using  the 
formula 


A,>=/Vi 


H-l 


+  a2  , 
1 


O' -A,-.)  (3) 


where  F=  log(l+log(l +FSIR()),  with  variance  estim¬ 
ated  by: 


VAR. 

a2  = - ! -  (4) 

(exp(//,.n_1)exp(exp(//.H_1)-1))2 

given  by  the  delta  method.  We  then  computed  the  sam¬ 
ple  mean  and  variance  of  ///;},  over  all  the  subjects  to  get 
pn  and  a2.  The  iteration  was  repeated  until  the  values 
stabilised. 

The  distribution  of  log(l  +  log(l  +FSIR)),  before  and 
after  transformation,  is  displayed  in  Figure  1.  Note  that 
the  ‘raw’  distribution  is  bimodal,  with  a  mode  at  zero 
that  disappears  after  transformation. 


Log  (1+(log(1+FSIR)) 


Transformed  Log  (1  +(log(1  +FSIR)) 

Fig .  J  The  distribution  of  log(l  +  log( J  +  FSIR))  before  (A)  and 
after  (B)  empirical  Bayes  adjustment. 


The  inverse  of  the  transformation  log(lFlog(l  +FSIR)) 
was  then  used  to  adjust  FSIR.  We  then  divided  the 
adjusted  FSIR  into  ‘Low’,  ‘Medium’  and  ‘High’  risk 
categories,  containing  the  same  fraction  of  the  data  as 
the  corresponding  categories  of  NIST.  Table  2  gives  the 
distribution  of  cases  and  person-years,  stratified  by 
adjusted  FSIR,  age  and  birth-year.  Mantel-Haenszel  RR 
and  95%  Cl  are  also  presented. 
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Regression  methods 

Two  regression  methods  were  used  to  analyse  the  data. 
The  first  was  a  standard  Poisson  regression  method, 
using  the  grouped  data  presented  in  Tables  1  and  2.  The 
data  was  further  stratified  into  5 -year  birth-year  inter¬ 
vals.  Let  d. . ,  and  n. . ,  be  the  number  of  cases  and 
person-years  of  risk  for  stratum  where  i  indexes 

age,  j  indexes  birth-year,  and  k  indexes  familial  risk 
(either  NIST  or  adjusted  FSIR).  Let  BY.  be  the  midpoint 
of  the  yth  birth-year  category.  The  final  Poisson  regres¬ 
sion  models  took  the  form: 

]o&(dij,k>  =  lo^nij,k> + ai+P[Q%(BY) + yk  (5) 

A  model  was  also  considered  with  interaction  terms  rj.  k 
between  age  and  familial  risk.  The  interaction  terms 
were  found  to  be  insignificant  both  with  NIST  (j 2 
=  31.14  on  22  degrees  of  freedom,  p  =  0.093)  and  with 
adjusted  FSIR  (% 2  =  20.68  on  23  degrees  of  freedom,  p 
=  0.60)  as  measures  of  familial  risk.  We  therefore 
chose  for  simplicity  to  leave  the  interaction  term  out  of 
our  final  model. 

Since  individual  level  data  were  available  for  this 
study,  survival  analysis  methods  were  appropriate,  and 
potentially  more  efficient  than  methods  for  aggregate 
data.  In  addition,  it  was  possible  to  handle  predictors  in  a 
truly  continuous  fashion,  and  obtain  a  smooth  estimate  of 
the  age-dependent  risk.  We  wished  for  the  analysis  to  be 
essentially  nonparametric,  we  therefore  modelled  the 
hazard  via  a  quadratic  spline13.  A  quadratic  spline  takes 
the  form  of  a  polynomial  of  second  degree  between  suc¬ 
cessive  break  points  t.,  called  ‘knots’.  A  quadratic  spline 
with  m  knots  specifies  the  hazard  to  be  of  the  form: 

A.W  =  X  V1'  +  X^  _  T?l  (6) 

/=0  j=\ 

where  (x)+  =  max(/,0).  The  spline  is  a  continuous  func¬ 
tion  with  continuous  first  derivative. 

We  fit  splines  with  knots  that  were  equally  spaced  in 
the  interior  of  the  interval  [T  .  ,T  ],  where  T  .  is  the 

L  mm’  max■,,  min 

minimum  age  of  any  subject  in  1965,  and  T  the  maxi¬ 
mum  follow-up  (failure  or  censoring)  time.  Thus,  with 
m  knots  the  number  of  parameters  was  m+3.  As  the 
number  of  knots  increases,  the  fit  becomes  increasingly 
nonparametric.  Since  the  number  of  knots  was  not  speci¬ 
fied  in  advance,  the  analysis  was  nonparametric  in  char¬ 
acter.  The  data  described  are  subject  to  random 
truncation:  cases  that  occurred  during  or  before  1965  are 
not  recorded  in  the  dataset.  Subjects  were  between  the 
ages  of  34-86  at  the  time  of  truncation.  In  the  more 
familiar  clinical  setting  in  which  survival  analysis  is 
used,  time  is  set  to  zero  for  all  patients  at  the  time  of 
diagnosis  or  treatment,  so  truncation  is  not  an  issue.  The 


truncation  is  taken  into  account  in  Poisson  regression 
and  in  the  calculation  of  FSIR  scores,  by  appropriately 
adjusting  the  number  of  subjects  in  the  risk-set.  Thus, 
analysis  of  the  data  must  take  into  account  not  only  the 
effects  of  right  censoring,  but  also  the  effects  of  left 
truncation  due  to  delayed  entry  into  the  risk-set.  The 
truncation  was  handled  by  using  a  conditional  likeli¬ 
hood;  conditioning  on  the  event  that  the  age  at  breast 
cancer  was  greater  than  the  age  at  which  as  subject 
entered  the  risk-set.  The  spline  regression  models  all 
contained  the  logarithm  of  birth-year  as  a  continuous 
covariate,  in  addition  to  a  measure  of  familial  risk. 

More  specifically,  let  X  be  the  observed  failure  or  cen¬ 
soring  age,  let  Fbe  the  age  at  truncation  and  let<5  be  a  censor¬ 
ing  indicator  (<5=1  if  the  failure  is  observed  and  <5=0 
otherwise).  Let  s~*  be  a  vector  of  covariates  and  z*  a  vector 
of  parameters.  Let  X(x,s~*;z*)  denote  the  hazard  associated 
with  the  failure  time-distribution  F(x,s~*;z*)  and  A(x, 
s~*;z~*)  the  cumulative  hazard  for  F(x,s~*;z~*).  Let  i  index  the 
subjects.  The  likelihood,  conditional  on  X  >  Y,  becomes: 

log  (CL)  = 

X^/  logU-Cfy  ?<))  ~  (M*;,  3-;  ?,)  -  a (y.,  7-  z*))]  (7) 

i 

The  proportional  hazards  model  2(x,?;z)=exp(^rz)20(x) 
was  used  to  model  covariate  effects  (the  logarithm  of 
birth  year  and  family  history). 

The  spline  estimates  were  computed  by  maximizing 
the  logarithm  of  the  conditional  likelihood  using  the 
algorithm  of  Powell14.  We  started  with  one  knot  and 
increased  the  number  of  knots  until  the  fit  was  not 
improved,  using  the  log-likelihood  as  a  guide.  This 
occurred  when  there  were  three  knots.  We  thus  used 
three  knots  for  all  the  results  in  the  following  section. 

Results 

Estimates  of  RR 

Tables  3  and  4  present  estimates  of  the  RR  and  95%  Cl 
for  ‘Moderate’  and  ‘High’  risk  categories  of  NIST  and 
FSIR,  compared  with  the  ‘Low’  risk  category,  using 
each  of  the  three  methods  presented  in  the  previous  sec¬ 
tion.  The  Mantel-Haenszel,  Poisson,  and  spline-based 
estimates  are  in  remarkably  good  agreement.  We  see 
that  the  ‘Moderate’  risk  categories  of  both  NIST  and 
FSIR  confer  an  approximate  1. 75-fold  increased  risk. 
The  ‘High’  risk  category  of  NIST  confers  an  approxi¬ 
mate  2.0-fold  increased  risk,  while  the  ‘High’  risk  cate¬ 
gory  of  FSIR  confers  an  approximate  2.8-fold  RR.  FSIR 
appeared  to  identify  subjects  with  higher  than  the  ‘Mod¬ 
erate’  level  of  risk.  The  ‘High’  and  ‘Moderate’  risk  cate¬ 
gories  of  NIST,  on  the  other  hand,  have  almost  the  same 
estimated  risk.  As  judged  by  the  length  of  the  Cl,  there 


Table  2  Distribution  of  cases  (C)  and  person-years  (PY)  stratified  by  age  and  FSIR 
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Table  3  RR,  95%  Cl,  and  y2  values  for  NIST,  computed  using  the  Mantel-Haenszel,  Poisson  regression  and  a  spline 
regression  method 


Category 

Model 

Mantel-Haenszel 

Poisson 

Spline 

Estimate 

MR/2 

Estimate 

Wald  x2 

Estimate  LR  % 2 

NIST  =  0 

1 

- 

1 

- 

1 

NIST  =  1 

1.75  (1.61-1.89) 

13.80 

1.76(1.62-1.90) 

186.91 

1.75(1.61-1.90) 

NIST  =  2+ 

2.01  (1.59-2.54) 

6.00 

2.04(1.61-2.57) 

35.82 

2.03  (1.58-2.56)  186.92: 

See  the  text  for  details.  All  the  methods  adjust  for  birth-year  and  age  and  have  the  lowest  level  as  reference  category. 
aTwo  df  likelihood  ratio  y2  f°r  inclusion  of  both  (NIST  =  1)  and  (NIST  =  2). 


Table  4  RR,  95%  Cl,  and  y2  values  for  FSIR,  computed  using  the  Mantel-Haenszel,  Poisson  regression  and  a  spline 
regression  method 


Category 

Mantel-Haenszel 
Estimate  MH  y2 

Model 

Poisson 

Estimate 

Wald  x2 

Spline 

Estimate  LR  y2 

Low  FSIR 

1 

- 

1 

- 

1 

Moderate  FSIR 

1.74(1.60-1.89) 

13.92 

1.67(1.65-1.93) 

188.77 

1.78  (1.65-1.93) 

High  FSIR 

2.89  (2.35-3.56) 

10.05 

2.83(1.17-3.46) 

93.99 

2.83  (2.27-3.46)  227.02a 

See  the  text  for  details.  All  the  methods  adjust  for  birth-year  and  age  and  have  the  lowest  level  as  reference  category. 
aTwo  df  likelihood  ratio  y2,  or  inclusion  of  both  ‘Moderate  FSIR’  and  ‘High  FSIR’. 


appears  to  be  little  gain  in  efficiency  for  estimation  of 
the  RR  from  the  regression  method  using  splines. 

More  than  90%  of  subjects  have  no  affected  first- 
degree  relatives,  and  thus  must  be  lumped  together  in 
the  ‘Low’  risk  category.  To  explore  the  affect  of  this  cat¬ 
egorisation  further,  we  fit  a  model  with  FSIR  treated  as 
a  continuous  predictor,  as  well  as  log(birth-year),  to  the 
subset  consisting  only  of  those  subjects  with  NIST  =  0. 
For  comparison,  we  fit  a  similar  model  to  the  entire 
dataset.  FSIR  was  a  highly  significant  predictor  of  risk, 
even  when  restricted  to  the  subjects  with  no  affected 
first-degree  relative,  with  significance  p  <  0.00001  as 
measured  by  the  likelihood  ratio  test  (with  the  logarithm 
of  birth-year  also  in  the  reference  model).  The  parame¬ 
ter  estimate  for  FSIR  from  the  restricted  data  was  1.11 
(95%  Cl  0.91-1.31).  This  model  corresponds  to  an 
approximate  three-fold  range  of  risk  in  the  subjects  with 
NIST  =  0.  By  comparison,  the  parameter  estimate  for 
the  entire  data  set  was  1.45  (95%  Cl  1.29-1.62),  which 
corresponds  to  an  approximate  5-fold  range  of  risk  over 
the  entire  dataset.  Thus,  there  appears  to  be  a  significant 
range  of  familial  risk  among  the  subjects  with  no  affec¬ 
ted  first-degree  relatives.  The  estimated  range  of  risk  is 
higher  using  FSIR  as  a  continuous  variable. 

Estimation  of  the  age-dependent  risk 

Estimates  of  the  age-dependent  risk  for  breast  cancer  for 
the  1901-05  birth  cohort  are  presented  in  Figure  2.  The 


estimates  for  the  ‘Low5  level  of  risk  using  Nlst  and 
FSIR  are  essentially  identical.  The  estimates  from 
Poission  regression  and  the  spline  model  are  in  good 
agreement.  Figure  3  shows  the  effect  of  birth-year  on 
the  ‘Low’  category  of  risk,  using  only  the  spline  models 
containing  FSIR.  The  crude  estimates  of  risk  are  also 
presented  in  Figure  3  for  comparison.  Adding  one  to  the 
birth-year  corresponded  to  an  approximate  2.7% 
increase  in  risk  (95%  Cl  2.4-3. 1%),  with  FSIR  in  a 
Poisson  regression  model.  Using  the  spline  model  with 
FSIR  the  estimated  increase  in  risk  per  year  is  3.5% 
(95%  Cl  3.2-3.9%).  Some  of  this  difference  may  be  due 
to  the  slightly  different  way  in  which  birth-year  was 
incorporated  into  the  models. 

The  crude  estimate  shows  a  decrease  in  risk  near  ages 
50-55  that  all  but  disappears  after  adjusting  for  birth- 
year.  There  is  also  a  less  pronounced  decrease  in  the 
crude  risk  near  age  90  that  also  disappears  after  adjust¬ 
ment  for  birth-year.  There  is  little  evidence  for  a 
decrease  in  risk  up  to  the  age  of  95  or  100.  More  recent 
birth  cohorts  appear  to  be  at  significantly  increased  risk 
compared  to  earlier  birth  cohorts. 

Discussion 

We  have  applied  two  methods  of  measuring  familial 
aggregation  at  the  individual  level  to  breast  cancer  data. 
As  might  be  expected,  both  prove  to  be  highly  signifi¬ 
cant  predictors  of  individual  risk  of  breast  cancer.  FSIR 
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Fig.  2  The  age-dependent  risk  for  breast  cancer  for  the 
] 90 1-0 5  birth  cohort ,  estimated  from  Poisson  regression  and  a 
spline  regression  model.  Estimates  are  presented  using  NIST 
(A)  and  FSIR  (B)  as  predictors  of  familial  risk. 

appeared  to  perform  better,  however,  at  identifying 
small  fractions  of  subjects  at  the  highest  risk.  This  also 
may  not  be  surprising,  since  FSIR  is  adjusted  for  the 
expected  number  of  cases  in  the  relatives  of  the  subject. 

In  addition,  by  performing  a  restricted  analysis  using 
FSIR  as  a  predictor  in  the  subset  of  subjects  with  no 
affected  first-degree  relatives,  we  see  that  N 1  ST  does 
not  appear  to  contain  enough  information  to  separate  out 
individuals  at  increased  risk.  Thus,  a  measure  that  incor¬ 
porates  the  observed  risk  of  a  wider  class  of  relatives 
seems  warranted. 

Methods  for  aggregate  data,  such  as  the  Poisson  regres¬ 
sion  method,  perform  remarkably  well  compared  to  the 
spline  regression  we  used,  as  long  as  the  data  is  cate¬ 
gorised  similarly  in  the  models.  Any  advantage  for  the 
spline  model  appears  to  be  in  the  ability  to  treat  variables 
in  a  truly  continuous  way.  These  advantages  allow  for 
more  complicated  modeling  strategies,  not  exploited  in 
the  present  paper,  such  as  accelerated  failure  time  models. 

As  an  interesting  by-product  of  the  analysis,  we  find 
that,  with  the  exception  of  a  possible  slight  decrease  in 
risk  between  the  ages  of  50  and  60,  the  risk  for  breast 


Fig.  3  The  age-dependent  risk  for  the  lowest  level  of  FSIR , 
estimated  for  several  birth  cohorts  using  a  spline  regression 
model.  The  crude  estimate  from  the  tabulated  data ,  which  is 
not  adjusted  for  birth -year,  is  also  presented  for  comparison. 

cancer  appears  to  be  non-decreasing  through  age  95. 
Thus,  we  find  no  evidence  of  an  ‘immune  fraction’  for 
breast  cancer.  First,  the  lowest  level  of  N 1  ST  appears  to 
have  a  high  degree  of  variability  of  familial  risk.  Sec¬ 
ondly,  the  age-dependent  risk  is  the  lowest  category  of 
NIST  or  FSIR  and  shows  no  evidence  of  a  peak  in  risk 
followed  by  a  drop,  which  would  be  the  case  if  suscepti¬ 
ble  individuals  were  contracting  breast  cancer  and  being 
removed  from  the  population,  thus  increasing  the 
immune  fraction  in  the  surviving  population. 

Other  investigators  have  either  estimated  or  simply 
assumed  that  the  risk  of  breast  cancer  decreases  past  a 
certain  age.  As  previously  noted,  Moolgavkar  et  alf, 
found  the  hazard  function  to  decrease  sharply  starting 
sometime  past  the  age  of  70.  By  age  90,  the  risk  has 
decreased  to  about  1/3  of  the  peak.  Parmigiani  et  cil . 15 
fit  breast-cancer  incidence  data  from  Easton  et  al 16  to 
a  three  parameter  gamma  distribution.  Implicit  in  this 
fitting  procedure  is  the  assumption  that  the  risk  to  car¬ 
riers  of  BRCAI  and  BRCA2  decreases  to  zero  with  age. 
There  is  little  evidence  for  this  in  the  data  used  by 
Parmigiani  et  al .,  as  the  highest  age  is  70.  It  may  be 
important  for  further  efforts  at  risk  prediction  to  better 
understand  the  hazards  to  carriers  of  disease  suscepti¬ 
bility  genes,  particularly  at  more  advanced  ages,  where 
data  are  sparse. 
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